issue with assignment operator inside printf() - c

Here is the code
int main()
{
int x=15;
printf("%d %d %d %d",x=1,x<20,x*1,x>10);
return 0;
}
And output is 1 1 1 1
I was expecting 1 1 15 1 as output,
x*1 equals to 15 but here x*1 is 1 , Why ?
Using assignment operator or modifying value inside printf() results in undefined behaviour?

Your code produces undefined behavior. Function argument evaluations are not sequenced relative to each other. Which means that modifying access to x in x=1 is not sequenced with relation to other accesses, like the one in x*1. The behavior is undefined.
Once again, it is undefined not because you "used assignment operator or modifying value inside printf()", but because you made a modifying access to variable that was not sequenced with relation to other accesses to the same variable. This code
(x = 1) + x * 1
also has undefined behavior for the very same reason, even though there's no printf in it. Meanwhile, this code
int x, y;
printf("%d %d", x = 1, y = 5);
is perfectly fine, even though it "uses assignment operator or modifying value inside printf()".

Within a function call, the function parameters may be evaluated in any order.
Since one of the parameters modifies x and the others access it, the results are undefined.

The Standard states that;
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
It doesn't impose an order of evaluation on sub-expressions unless there's a sequence point between them, and rather than requiring some unspecified order of evaluation, it says that modifying an object twice produces undefined behaviour.

Related

unable to understand how expressions are getting false [duplicate]

This question already has answers here:
Undefined behavior and sequence points
(5 answers)
Parameter evaluation order before a function calling in C
(7 answers)
Closed 8 years ago.
int main( )
{
int k = 35 ;
printf ( "\n%d %d %d", k == 35, k = 50, k > 40 ) ;
}
The output of the above program is "0 50 0" without quotes. My question is how k==35 and k > 40 are false? From my perspective, k is assigned 35 so "k==35" should be true; then k is assigned 50 in "k=50", so "k > 40" has to be true, doesn't it?
The printf call invokes undefined behaviour.
The order of evaluation of arguments to a function is unspecified behaviour. It means that the arguments can be evaluated in any order from a set of all possible orders. The standard does not require the implementation to enforce any order. Also, the comma which separates the arguments is not the comma operator. This means there is no sequence point between the evaluation of the arguments. The only requirement is that all the arguments must be fully evaluated and all side effects must have taken place before the function is called. Now, the C99 standard §6.5 ¶2 says
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be read only to determine the value
to be stored.
The evaluation of the arguments to a function is unsequenced. The assignment expression k = 50 evaluates to 50 but its side effect is to assign 50 to k. If this expression is evaluated first and its side effect immediately takes place, then the expression k == 35 would evaluate to true and k > 40 would evaluate to false. Thus depending on when the side effect of expression the k = 50 takes place, the other two expressions would evaluate to true or false. However, the other two expressions violate the second point in the above quoted part from the standard. The expressions access k but it is not to change the value of k. This violates the prior value shall be read only to determine the value
to be stored clause. This is the reason for the undefined behaviour.
Just to emphasize, the undefined behaviour in your code is not because the order of evaluation of arguments is unspecified (not undefined, which is different), but it's because there's no sequence point between the evaluation of the arguments, and as a result, the above quoted part from the standard is violated in this context.
Undefined behaviour means the behaviour of the code is unpredictable. Anything can happen from program crash to your hard drive getting formatted to demons flying out your nose (to launch into hyperbole). This is only to emphasize that the standard does not require the implementation to deal with such cases and you should always avoid such code.
Furhter reading -
What Every C Programmer Should Know About Undefined Behavior
Undefined Behavior and Sequence Points
Why is f(i = -1, i = -1) undefined behavior?
Order of evaluation
Your code invokes undefined behaviour because of the assignment in the argument list:
k = 50
You presumably want:
printf("%d %d %d\n", k == 35, k == 50, k > 40);
Since you embed an assignment in the argument list, any result is permissible. There is no required order of evaluation for arguments; they could be evaluated in any order — any result is correct. For your own sanity, do not ever write code that depends on the order of evaluation of arguments.
If the book you are learning from is not simply emphasizing that the order of evaluation is undefined and that any undefined behaviour is bad, then you should probably discard the book.
It is Undefined Behavior, which means anything may happen.
Why is it UB?
The answer is simply that there is no ordering constraint, not even indeterminate sequencing, between a write of k and a read of k which is not used to determine the new value of k.
printf ( "\n%d %d %d", k == 35, k = 50, k > 40 );
// No ordering constraint between function arguments.
Aside: If the write happened in a function called in the argument list instead (not in the arguments to that function), it would be indeterminately sequenced and thus not UB.
Indeterminately sequenced means it is sequenced before or after without determining which. It is not quite as bad as UB (anything goes), but it should still not be relied on, as the order is unreliable.
int SetVar(int *p, int v) {return *p = v;}
printf ( "\n%d %d %d", k == 35, SetVar(&k, 50), k > 40 );
// No ordering constraint between function arguments, but
// the function call is indeterminately ordered

Unexpected working of printf() in C [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 9 years ago.
#include<stdio.h>
int main()
{
int i=2;
printf("%d %d \n",++i,++i);
}
The above code gives an output 4 4. Can any one help how to explain the output?
++i is a prefix increment. Printf should first evaluate its arguments before printing them (although in which order, is not guaranteed and, strictly speaking, undefined - See Wiki entry on Undefined Behavior: http://en.wikipedia.org/wiki/Undefined_behavior ).
The prefix increment is called "increment and fetch", i.e. it first increments the value and then gives it to the caller.
In your case, i was first incremented twice, and only afterwards the output was formatted and sent to the console.
It has to do with sequence points and it can result in undefined behavior.
Straight from Wikipedia:
Before a function is entered in a function call. The order in which
the arguments are evaluated is not specified, but this sequence point
means that all of their side effects are complete before the function
is entered.
More info here: http://en.wikipedia.org/wiki/Sequence_point
Both the answers have made the same mistake. Its clear cut UB and not just unspecified behavior.
What you have experienced is Undefined behavior. Please read about sequence points. Comma is a separator in function calls not an operator.
A sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete. The sequence points listed in the C standard are:
at the end of the evaluation of a full expression (a full
expression is an expression statement, or any other expression which
is not a subexpression within any larger expression);
at the ||, &&, ?:, and comma operators; and
at a function call (after the evaluation of all the arguments, and just before the actual call).
The Standard states that
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be accessed only to determine the
value to be stored.
What will be evaluated first this "%d %d \n", this ++i or this ++i (second one) - think about it. This would be unspecified behavior:
void f(int x)
{
printf("%d ",x);
}
int main()
{
int i=0;
f(i++) ;
}
From wiki:
Before a function is entered in a function call. The order in which
the arguments are evaluated is not specified, but this sequence point
means that all of their side effects are complete before the function
is entered. In the expression f(i++) + g(j++) + h(k++), f is called
with a parameter of the original value of i, but i is incremented
before entering the body of f. Similarly, j and k are updated before
entering g and h respectively. However, it is not specified in which
order f(), g(), h() are executed, nor in which order i, j, k are
incremented. Variables j and k in the body of f may or may not have
been already incremented. Note that a function call f(a,b,c) is not a
use of the comma operator and the order of evaluation for a, b, and c
is unspecified.

Defined behaviour for expressions

The C99 Standard says in $6.5.2.
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
(emphasis by me)
It goes on to note, that the following example is valid (which seems obvious at first)
a[i] = i;
While it does not explicitly state what a and i are.
Although I believe it does not, I'd like to know whether this example covers the following case:
int i = 0, *a = &i;
a[i] = i;
This will not change the value of i, but access the value of i to determine the address where to put the value. Or is it irrelevant that we assign a value to i which is already stored in i? Please shed some light.
Bonus question; What about a[i]++ or a[i] = 1?
The first sentence:
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
is clear enough. The language doesn't impose an order of evaluation on subexpressions unless there's a sequence point between them, and rather than requiring some unspecified order of evaluation, it says that modifying an object twice produces undefined behavior. This allows aggressive optimization while still making it possible to write code that follows the rules.
The next sentence:
Furthermore, the prior value shall be read only to determine the value to be stored
does seem unintuitive at first (and second) glance; why should the purpose for which a value is read affect whether an expression has defined behavior?
But what it reflects is that if a subexpression B depends on the result of a subexpression A, then A must be evaluated before B can be evaluated. The C90 and C99 standards do not state this explicitly.
A clearer violation of that sentence, given in an example in the footnote, is:
a[i++] = i; /* undefined behavior */
Assuming that a is a declared array object and i is a declared integer object (no pointer or macro trickery), no object is modified more than once, so it doesn't violate the first sentence. But the evaluation of i++ on the LHS determines which object is to be modified, and the evaluation of i on the RHS determines the value to be stored in that object -- and the relative order of the read operation on the RHS and the write operation on the LHS is not defined. Again, the language could have required the subexpressions to be evaluated in some unspecified order, but instead it left the entire behavior undefined, to permit more aggressive optimization.
In your example:
int i = 0, *a = &i;
a[i] = i; /* undefined behavior (I think) */
the previous value of i is read both to determine the value to be stored and to determine which object it's going to be stored in. Since a[i] refers to i (but only because i==0), modifying the value of i would change the object to which the lvalue a[i] refers. It happens in this case that the value stored in i is the same as the value that was already stored there (0), but the standard doesn't make an exception for stores that happen to store the same value. I believe the behavior is undefined. (Of course the example in the standard wasn't intended to cover this case; it implicitly assumes that a is a declared array object unrelated to i.)
As for the example that the standard says is allowed:
int a[10], i = 0; /* implicit, not stated in standard */
a[i] = i;
one could interpret the standard to say that it's undefined. But I think that the second sentence, referring to "the prior value", applies only to the value of an object that's modified by the expression. i is never modified by the expression, so there's no conflict. The value of i is used both to determine the object to be modified by the assignment, and the value to be stored there, but that's ok, since the value of i itself never changes. The value of i isn't "the prior value", it's just the value.
The C11 standard has a new model for this kind of expression evaluation -- or rather, it expresses the same model in different words. Rather than "sequence points", it talks about side effects being sequenced before or after each other, or unsequenced relative to each other. It makes explicit the idea that if a subexpression B depends on the result of a subexpression A, then A must be evaluated before B can be evaluated.
In the N1570 draft, section 6.5 says:
1 An expression is a sequence of operators and operands that
specifies computation of a value, or that designates an object
or a function, or that generates side effects, or that performs
a combination thereof. The value computations of the operands
of an operator are sequenced before the value computation of the
result of the operator.
2 If a side effect on a scalar object is unsequenced relative to
either a different side effect on the same scalar object or a
value computation using the value of the same scalar object, the
behavior is undefined. If there are multiple allowable orderings
of the subexpressions of an expression, the behavior is undefined
if such an unsequenced side effect occurs in any of the orderings.
3 The grouping of operators and operands is indicated by the syntax.
Except as specified later, side effects and value computations
of subexpressions are unsequenced.
Reading an object's value to determine where to store it does not count as "determining the value to be stored". This means that the only point of contention can be whether or not we are "modifying" the object i: if we are, it is undefined; if we are not, it is OK.
Does storing the value 0 into an object which already contains the value 0 count as "modifying the stored value"? By the plain English definition of "modify" I would have to say not; leaving something unchanged is the opposite of modifying it.
It is, however, clear that this would be undefined behaviour:
int i = 0, *a = &i;
a[i] = 1;
Here there can be no doubt that the stored value is read for a purpose other than determining the value to be stored (the value to be stored is a constant), and that the value of i is modified.

Why is `x-- > 0` not undefined behaviour, while `x = x--` is?

As everyone knows, this loops through zero:
while (x-- > 0) { /* also known as x --> 0 */
printf("x = %d\n", x);
}
But x = x-- yields undefined behaviour.
Both examples need some 'return' value of x--, which is not there I guess. How can it be that x-- > 0 is defined but x = x-- is not?
Because in x = x-- you're modifying the value of x twice without an intervening sequence point. So the order of operations is not defined. In x-- > 0 the value of x is modified once, and it is clearly defined that result of evaluating x-- will be the value of x before the decrement.
I don't know where you got that idea about "need some 'return' value of x--, which is not there". Firstly, it is not exactly clear what you mean. Secondly, regardless of what you mean this doesn't seem to have anything to do with the source of undefined behavior in x = x--.
x = x-- produces undefined behavior because it attempts to modify x twice without an intervening sequence point. No "need" for any "return value" is involved here.
The underlying problem with x = x-- is that it has two side-effects that occur at undefined moments in undefined order. One side-effect is introduced by the assignment operator. Another side-effect is introduced by postfix -- operator. Both side-effects attempt to modify the same variable x and generally contradict each other. This is why the behavior in such cases is declared undefined de jure.
For example, if the original value of x was 5, then your expression requires x to become both 4 (side-effect of decrement) and 5 (side-effect of assignment) at the same time. Needless to say, it is impossible for x to become 4 and 5 at the same time.
Although such a straightforward contradiction (like 4 vs 5) is not required for UB to occur. Every time you have two side-effects hitting the same variable without intervening sequence point, the behavior is undefined, even if the values these side-effects are trying to put into the variable match.
In order to understand this you need to have a basic understanding of sequence points. See this link: http://en.wikipedia.org/wiki/Sequence_point
For the = operator there is no sequence point, so there is no guarantee that the value of x will be modified before it is again assigned to x.
When you are checking the condition in the while loop x-- > 0, x-- is evaluated and the value is used in the relational operator evaluation so there is no chance of undefined behaviour because x is getting modified only once.
Just to add something to other answers, try reading this wikipedia page about sequence points.
I suggest reading https://stackoverflow.com/a/21671069/258418. If you chuck together that = is not a sequence point, and the compiler is free to interleave operations, as long as they are not separated by a sequence point from the answers linked by you, you see that i.e. the following two sequences would be legal:
load i to reg
increment i
assign reg to i
=> i has previous value of i
load i to reg
assign reg to i
increment i
=> i has value of previous value of i + 1
In general: avoid assigning (this includes modiying by pre/post ++/--) to the same variable twice in one expression.

Question about C programming

int a, b;
a = 1;
a = a + a++;
a = 1;
b = a + a++;
printf("%d %d, a, b);
output : 3,2
What's the difference between line 3 and 5?
What you are doing is undefined.
You can't change the value of a variable you are about to assign to.
You also can't change the value of a variable with a side effect and also try to use that same variable elsewhere in the same expression (unless there is a sequence point, but in this case there isn't). The order of evaluation for the two arguments for + is undefined.
So if there is a difference between the two lines, it is that the first is undefined for two reasons, and line 5 is only undefined for one reason. But the point is both line 3 and line 5 are undefined and doing either is wrong.
What you're doing on line 3 is undefined. C++ has the concept of "sequence points" (usually delimited by semicolons). If you modify an object more than once per sequence point, it's illegal, as you've done in line 3. As section 6.5 of C99 says:
(2) Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
Line 5 is also undefined because of the second sentence. You read a to get its value, which you then use in another assignment in a++.
a++ is a post-fix operator, it gets the value of a then increments it.
So, for lines 2,3:
a = 1
a = 1 + 1, a is incremented.
a becomes 3 (Note, the order these operations are performed may vary between compilers, and a can easily also become 2)
for lines 4,5:
a = 1
b = 1 + 1, a is incremented.
b becomes 2, a becomes 2. (Due to undefined behaviour, b could also become 3 of a++ is processed before a)
Note that, other than for understanding how postfix operators work, I really wouldn't recommend using this trick. It's undefined behavior and will get different results when compiled using different compilers
As such, it is not only a needlessly confusing way to do things, but an unreliable, and worst-practice way of doing it.
EDIT: And has others have pointed out, this is actually undefined behavior.
Line 3 is undefined, line 5 is not.
EDIT:
As Prasoon correctly points out, both are UB.
The simple expression a + a++ is undefined because of the following:
The operator + is not a sequence point, so the side effects of each operands may happen in either order.
a is initially 1.
One of two possible [sensible] scenarios may occur:
The first operand, a is evaluated first,
a) Its value, 1 will be stored in a register, R. No side effects occur.
b) The second operand a++ is evaluated. It evaluates to 1 also, and is added to the same register R. As a side effect, the stored value of a is set to 2.
c) The result of the addition, currently in R is written back to a. The final value of a is 2.
The second operand a++ is evaluated first.
a) It is evaluated to 1 and stored in register R. The stored value of a is incremented to 2.
b) The first operand a is read. It now contains the value 2, not 1! It is added to R.
c) R contains 3, and this result is written back to a. The result of the addition is now 3, not 2, like in our first case!
In short, you mustn't rely on such code to work at all.

Resources