Why is the output of the following code 543 instead of 345? [duplicate] - c

In C and C++, is there a fixed order for evaluation of parameter to the function? I mean, what do the standards say? Is it left-to-right or right-to-left?
I am getting confusing information from the books.
Is it necessary that function call should be implemented using stack only? What do the C and C++ standards say about this?

C and C++ are two completely different languages; don't assume the same rules always apply to both. In the case of parameter evaluation order, however:
C99:
6.5.2.2 Function calls
...
10 The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
[Edit]
C11 (draft):
6.5.2.2 Function calls
...
10 There is a sequence point after the evaluations of the function designator and the actual
arguments but before the actual call. Every evaluation in the calling function (including
other function calls) that is not otherwise specifically sequenced before or after the
execution of the body of the called function is indeterminately sequenced with respect to
the execution of the called function.94)
...
94) In other words, function executions do not ‘‘interleave’’ with each other.
C++:
5.2.2 Function call
...
8 The order of evaluation of arguments is unspecified. All side effects of argument expression evaluations take effect
before the function is entered. The order of evaluation of the postfix expression and the argument expression list is
unspecified.
Neither standard mandates the use of the hardware stack for passing function parameters; that's an implementation detail. The C++ standard uses the term "unwinding the stack" to describe calling destructors for automatically created objects on the path from a try block to a throw-expression, but that's it. Most popular architectures do pass parameters via a hardware stack, but it's not universal.
[Edit]
I am getting confusing information from the books.
This is not in the least surprising, since easily 90% of books written about C are simply crap.
While the language standard isn't a great resource for learning either C or C++, it's good to have handy for questions like this. The official™ standards documents cost real money, but there are drafts that are freely available online, and should be good enough for most purposes.
The latest C99 draft (with updates since original publication) is available here. The latest pre-publication C11 draft (which was officially ratified last year) is available here. And a publicly available draft of the C++ language is available here, although it has an explicit disclaimer that some of the information is incomplete or incorrect.

Keeping it safe: the standard leaves it up to the compiler to determine the order in which arguments are evaluated. So you shouldn't rely on a specific order being kept.

In C/C++ is there a fixed order for evaluation of parameter to the function. I mean what does standards says is it left-to-right or right-to-left . I am getting confusing information from the books.
No, the order of evaluation of function arguments (and of the sub-expressions of most operators) is unspecified in C and C++. In plain English, that means the left-most argument could be evaluated first, or it could be the right-most one, and you cannot know which order a particular compiler will use.
Example:
#include <iostream>

static int x = 0;

int* func(int val)
{
    x = val;
    return &x;
}

void print(int val1, int val2)
{
    std::cout << val1 << " " << val2 << std::endl;
}

int main()
{
    print(*func(1), *func(2));  // which func() call runs first is unspecified
}
This code is very bad. It relies on the order of evaluation of print's arguments. It may print "1 1" (right-to-left) or "2 2" (left-to-right), and we cannot know which. The only thing guaranteed by the standard is that both calls to func() are completed before the call to print().
The solution to this is to be aware that the order is unspecified, and write programs that don't rely on the order of evaluation. For example:
int val1 = *func(1);
int val2 = *func(2);
print(val1, val2); // Will always print "1 2" on any compiler.
Is it necessary that function call should be implemented using stack only. what does C/C++ standards says about this.
This is governed by the "calling convention", which the standard does not specify at all. How parameters (and return values) are passed is entirely up to the implementation. They could be passed in CPU registers, on the stack, or in some other way. The caller could be the one responsible for pushing/popping parameters on the stack, or the called function could be responsible.
The order of evaluation of function parameters is only somewhat associated with the calling convention, since the evaluation occurs before the function is called. But on the other hand, certain compilers can choose to put the right-most parameter in a CPU register and the rest of them on the stack, as one example.

Speaking only for the C language, the order of evaluation of function arguments depends on the compiler. From The C Programming Language by Brian Kernighan and Dennis Ritchie:
Similarly, the order in which function arguments are evaluated is not
specified, so the statement
printf("%d %d\n", ++n, power(2, n)); /*WRONG */
can produce different results with different compilers,
depending on whether n is incremented before power is called. The
solution, of course, is to write
++n;
printf("%d %d\n", n, power(2, n));

As far as I know, printf looks like an exception here, but only in practice.
For an ordinary function foo, the order of evaluation of the arguments in foo(bar(x++), baz(++x)) is not defined by the standard. The standard gives printf no special guarantee either; however, printf, being a variadic (ellipsis) function, tends to get a more predictable order on implementations that pass arguments on the stack.
The printf in the standard library has no information about the number of arguments it was passed. It works out the number of arguments from the placeholders, namely from the conversion specifiers (%) in the format string. With a typical stack-based calling convention, the compiler pushes arguments from right to left, so the address of the format string is pushed last. Having no exact information about the number of arguments, printf reads that last address (the format string) and replaces the %'s, from left to right, with the values found at the corresponding positions on the stack. That is, for a printf like the one below:
{
    int x = 0;
    printf("%d %d %f\n", foo(x), bar(x++), baz(++x));
}
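With a compiler that evaluates and pushes arguments from right to left, the order of evaluation would typically be (the standard does not guarantee it):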
x is incremented by 1,
function baz is called with x = 1; return value is pushed to the stack,
function bar is called with x = 1; return value is pushed to the stack,
x is incremented by 1,
function foo is called with x = 2; return value is pushed to the stack,
the string address is pushed to the stack.
Now, printf has no information about the number of arguments it was passed. Moreover, if warnings such as -Wall are not enabled, the compiler will not even complain about an inconsistent number of arguments. That is, the string may contain 3 %'s while the printf call has 1, 2, 3, 4 or 5 arguments, or even just the string itself without any arguments at all.
Say the string has 3 placeholders and you've sent 5 arguments (printf("%d %f %s\n", k1, k2, k3, k4, k5)). If compilation warnings are turned off, the compiler will not complain about the excess arguments (or the insufficient number of placeholders). Knowing the stack address, printf will:
treat the first integer width in the stack and print it as an integer (without knowing whether it is an integer or not),
treat the next double width in the stack and print it as a double (without knowing whether it is a double or not),
treat the next pointer width in the stack as a string address and try to print the string at that address until it finds a string terminator (provided the address belongs to that process; otherwise a segmentation fault is raised),
and ignore the remaining arguments.

Related

Sequence points and I/O format Specifiers in C

The C standard (Annex C) states that there is a sequence point
After the actions associated with each formatted input/output
function conversion specifier.
Given that, why am I getting an unsequenced modification and access to i (clang) warning for the below?
int i = 0;
printf("%d, %d\n", i, ++i);
Based on the standard, there is a sequence point after the first and the second %d. If so I should be getting a 0 1? But then there is no ordering guarantee in the evaluation of function arguments and I could be getting 1 1 instead?
So what does the text of the standard I quoted really mean?
Given that, why am I getting an unsequenced modification and access to
i (clang) warning for the below?
int i = 0;
printf("%d, %d\n", i, ++i);
Because the problem happens when evaluating the function's argument list, before the function is actually called. The evaluations of multiple arguments to the same function call are not sequenced with respect to each other, and you both read and modify i via separate, unsequenced arguments.
The provision you cited is not relevant to this issue. It describes sequence points between the I/O operations performed by the function when it executes. Because function arguments are always passed by value, and because there is a sequence point between evaluating the arguments and executing the function body, I don't see any practical relevance of that provision to the printf-family functions.
For scanf & friends, however, the provision helps ensure that
int i;
scanf("%d, %d", &i, &i);
has well-defined behavior, because it specifies that the two resulting writes to i are sequenced with respect to each other.

How does this program duplicate itself?

This code is from Hacker's Delight. It says this is the shortest such program in C and is 64 characters in length, but I don't understand it:
main(a){printf(a,34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);}
I tried to compile it. It compiles with 3 warnings and no error.
This program relies upon the assumptions that
return type of main is int
function's parameter type is int by default and
the argument a="main(a){printf(a,34,a=%c%s%c,34);}" will be evaluated first.
It will invoke undefined behavior. Order of evaluation of arguments of a function is not guaranteed in C.
Nevertheless, on implementations where it happens to work, the program behaves as follows:
The assignment expression a="main(a){printf(a,34,a=%c%s%c,34);}" will assign the string "main(a){printf(a,34,a=%c%s%c,34);}" to a and the value of the assignment expression would be "main(a){printf(a,34,a=%c%s%c,34);}" too as per C standard --C11: 6.5.16
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment [...]
Bearing in mind the above semantics of the assignment operator, the program is effectively expanded to
main(a){
    printf("main(a){printf(a,34,a=%c%s%c,34);}",34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);
}
ASCII 34 is ". The conversion specifiers and their corresponding arguments are:
%c ---> 34
%s ---> "main(a){printf(a,34,a=%c%s%c,34);}"
%c ---> 34
A better version would be
main(a){a="main(a){a=%c%s%c;printf(a,34,a,34);}";printf(a,34,a,34);}
It is 4 characters longer but at least follows K&R C.
It relies on several quirks of the C language and (what I think is) undefined behavior.
First, it defines the main function. It is legal to declare a function without a return type or parameter types, and they will be presumed to be int. This is why the main(a){ part works.
Then, it calls printf with 4 parameters. Since it has no prototype, it is assumed to return int and accept int parameters (unless your compiler implicitly declares it otherwise, like Clang does).
The first parameter is presumed int and is argc at the beginning of the program. The second parameter is 34 (which is ASCII for the double-quote character). The third parameter is an assignment expression that assigns the format string to a and returns it. It relies on a pointer-to-int conversion, which is legal in C. The last parameter is another quote character in numeric form.
At runtime, the %c format specifiers are substituted with quotes, the %s is substituted with the format string, and you get the original source again.
As far as I know, the order of argument evaluation is not defined. This quine works because the assignment a="main(a){printf(a,34,a=%c%s%c,34);}" is evaluated before a is passed as the first parameter to printf, but as far as I know, there is no rule to enforce it. Additionally, this typically can't work on 64-bit platforms where int is 32 bits wide, because the pointer-to-int conversion truncates the pointer. As a matter of fact, even though I can see how it works on some platforms, it doesn't work on my computer with my compiler.
This works based on lots of quirks that C allows you to do, and some undefined behavior that happens to work in your favor. In order:
main(a) { ...
Types are assumed to be int if unspecified, so this is equivalent to:
int main(int a) { ...
main is supposed to take either 0 or 2 arguments, so this signature is undefined behavior, but in practice an implementation may tolerate it and simply ignore the missing second argument.
Next, the body, which I will space out. Note that a is an int as per main:
printf(a,
       34,
       a = "main(a){printf(a,34,a=%c%s%c,34);}",
       34);
The order of evaluation of arguments is unspecified, but we're relying on the 3rd argument (the assignment) getting evaluated first. We're also relying on being able to assign a char * to an int, which is not portable. Also, note that 34 is the ASCII value of ". Thus, the intended effect of the program is:
int main(int a, char **argv) {
    printf("main(a){printf(a,34,a=%c%s%c,34);}",
           '"',
           "main(a){printf(a,34,a=%c%s%c,34);}",
           '"');
    return 0; // also left off
}
Which, when evaluated, produces:
main(a){printf(a,34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);}
which was the original program. Tada!
The program is supposed to print its own code. Note the similarity of the string literal to the overall program code. The idea is that the literal will be used as the printf() format string because its value is assigned to variable a (albeit in the argument list) and that it will also be passed as the string to print (because an assignment expression evaluates to the value that was assigned). The 34 is the ASCII code for the double quote character ("); using it avoids a format string containing escaped literal quotation mark characters.
The code relies on unspecified behavior in the form of the order of evaluation of the function arguments. If they are evaluated in argument list order then the program is likely to fail because the value of a would then be used as a pointer to the format string before the correct value was actually assigned to it.
Additionally, the type of a defaults to int, and there is no guarantee that int is wide enough to hold an object pointer without truncating it.
Furthermore, the C standard specifies only two permitted signatures for main(), and the signature used is not among them.
Moreover, the type of printf() inferred by the compiler in the absence of a prototype is incorrect. It is by no means guaranteed that the compiler will generate a calling sequence that works for it.

Calling Convention Confusion [duplicate]

I'm not able to understand the output of this program (using gcc).
main()
{
    int a=10;
    printf("%d %d %d\n",++a, a++,a);
}
Output:
12 10 12
Also, please explain the order of evaluation of arguments of printf().
The compiler will evaluate printf's arguments in whatever order it happens to feel like at the time. It could be an optimization thing, but there's no guarantee: the order they are evaluated isn't specified by the standard, nor is it implementation defined. There's no way of knowing.
But what is specified by the standard is that modifying the same variable twice without an intervening sequence point is undefined behavior; ISO C++03, 5[expr]/4:
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
printf("%d %d %d\n",++a, a++,a); could do a number of things; work how you expected it, or work in ways you could never understand.
You shouldn't write code like this.
AFAIK there is no defined order of evaluation for the arguments of a function call, and the results might vary for each compiler. In this instance I can guess the middle argument was evaluated first, followed by the first, and then the third.
As haggai_e hinted, the parameters are evaluated in this order: middle, left, right.
To fully understand why these particular numbers are showing up, you have to understand how the increment works.
a++ means "do something with a, and then increment it afterwards".
++a means "increment a first, then do something with the new value".
In your particular example, the compiler apparently evaluated a++ first: it read 10 as the argument value and then incremented a to 11. It then evaluated ++a, which incremented a to 12. The last argument, a, was read as it was (12) and passed without any change. Bear in mind that this only describes what one compiler happened to do; the behavior is undefined, so no particular result is guaranteed.
Although the arguments are evaluated in an unspecified order, they are displayed in the order you listed them. That's why you get 12 10 12 and not 10 12 12.

How does an equal to expression work in a printf placeholder? [duplicate]

I have the following code snippet:
main( )
{
    int k = 35 ;
    printf ( "\n%d %d %d", k == 35, k = 50, k > 40 ) ;
}
which produces the following output
0 50 0
I'm not sure I understand how the first value of the printf comes to 0. When the value of k is compared with 35, it should ideally return (and thus print) 1, but how is it printing zero? The other two values that are produced- 50 and 0 are all right, because in the second value, the value of k is taken as 50, and for the third value- the value of k(which is 35) is compared with 40. Since 35 < 40, so it prints 0.
Any help would be appreciated, thanks.
**UPDATE**
After researching more on this topic and also on undefined behavior, I came across this in a book on C, source is given at the end.
Calling Convention
Calling convention indicates the order in which arguments are passed to a function when a function call is encountered. There are two possibilities here:
Arguments might be passed from left to right.
Arguments might be passed from right to left.
C language follows the second order.
Consider the following function call:
fun (a, b, c, d ) ;
In this call it doesn’t matter whether the arguments are passed from left to right or from right to left. However, in some function call the order of passing arguments becomes an important consideration. For example:
int a = 1 ;
printf ( "%d %d %d", a, ++a, a++ ) ;
It appears that this printf( ) would output 1 2 3. This however is not the case. Surprisingly, it outputs 3 3 1.
This is because C’s calling convention is from right to left. That is, firstly 1 is passed through the expression a++ and then a is incremented to 2. Then result of ++a is passed. That is, a is incremented to 3 and then passed. Finally, latest value of a, i.e. 3, is passed. Thus in right to left order 1, 3, 3 get passed. Once printf( ) collects them it prints them in the order in which we have asked it to get them printed (and not the order in which they were passed). Thus 3 3 1 gets printed.
**Source: "Let Us C" 5th edition, Author: Yashwant Kanetkar, Chapter 5: Functions and Pointers**
Regardless of whether this question is a duplicate or not, I found this new information to be helpful to me, so decided to share. Note: This also supports the claim presented by Mr.Zurg in the comments below.
Unfortunately for everyone who reads that book, it is totally wrong. The draft C99 standard clearly makes this code undefined behavior. The Wikipedia entry on undefined behavior contains a similar example and identifies it as undefined. I won't leave it at that, but note that there are other easily accessible sources that get this right without having to resort to the standard.
So what does the standard say? Section 6.5 Expressions, paragraph 3 says:
The grouping of operators and operands is indicated by the syntax.74)
Except as specified later (for the function-call (), &&, ||, ?:, and
comma operators), the order of evaluation of subexpressions and the
order in which side effects take place are both unspecified.
So unless specified the order of evaluation of sub-expressions is unspecified, and further section 6.5.2.2 Function calls paragraph 10 says:
The order of evaluation of the function designator, the actual
arguments, and subexpressions within the actual arguments is
unspecified, but there is a sequence point before the actual call.
So in your example:
printf ( "\n%d %d %d", k == 35, k = 50, k > 40 ) ;
         ^             ^        ^       ^
         1             2        3       4
sub-expression 1 to 4 could be evaluated in any order and we have no way of knowing when the side effects from each sub-expression will take place although we know they all have to take effect before the function is actually called.
So k = 50 could be evaluated first or last or anywhere in between, and regardless of when it is evaluated, the side effect of changing the value of k could take place immediately afterwards or not until just before the function is actually called. This means the results are unpredictable and could conceivably change between different executions.
So next we have to tackle how this becomes undefined behavior. This is covered in section 6.5 paragraph 2 which says:
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an
expression.72) Furthermore, the prior value shall be read only to
determine the value to be stored.73)
So in your example we are not modifying k more than once but we are using the prior value to do other things besides determine the value to be stored. So what are the implications of undefined behavior? In the definition of undefined behavior the standard says:
Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).
So the compiler could ignore it or could also produce unpredictable results, which covers a pretty wide range of undesirable behaviors. It has infamously been said that:
When the compiler encounters [a given undefined construct] it is legal
for it to make demons fly out of your nose
which sounds pretty undesirable to me.

Resources