Why is `x-- > 0` not undefined behaviour, while `x = x--` is? - c

As everyone knows, this loops through zero:
while (x-- > 0) { /* also known as x --> 0 */
    printf("x = %d\n", x);
}
But x = x-- yields undefined behaviour.
Both examples need some 'return' value of x--, which is not there I guess. How can it be that x-- > 0 is defined but x = x-- is not?

Because in x = x-- you're modifying the value of x twice without an intervening sequence point, so the order in which the two modifications happen is not defined. In x-- > 0 the value of x is modified once, and it is clearly defined that the result of evaluating x-- is the value of x before the decrement.

I don't know where you got that idea about "need some 'return' value of x--, which is not there". Firstly, it is not exactly clear what you mean. Secondly, regardless of what you mean this doesn't seem to have anything to do with the source of undefined behavior in x = x--.
x = x-- produces undefined behavior because it attempts to modify x twice without an intervening sequence point. No "need" for any "return value" is involved here.
The underlying problem with x = x-- is that it has two side-effects that occur at undefined moments in undefined order. One side-effect is introduced by the assignment operator. Another side-effect is introduced by postfix -- operator. Both side-effects attempt to modify the same variable x and generally contradict each other. This is why the behavior in such cases is declared undefined de jure.
For example, if the original value of x was 5, then your expression requires x to become both 4 (side-effect of decrement) and 5 (side-effect of assignment) at the same time. Needless to say, it is impossible for x to become 4 and 5 at the same time.
Such a straightforward contradiction (like 4 vs 5) is not required for UB to occur, though. Every time you have two side-effects hitting the same variable without an intervening sequence point, the behavior is undefined, even if the values these side-effects are trying to put into the variable match.

In order to understand this you need to have a basic understanding of sequence points. See this link: http://en.wikipedia.org/wiki/Sequence_point
For the = operator there is no sequence point, so there is no guarantee that the value of x will be modified before it is again assigned to x.
When you are checking the condition in the while loop x-- > 0, x-- is evaluated and the value is used in the relational operator evaluation so there is no chance of undefined behaviour because x is getting modified only once.

Just to add something to other answers, try reading this wikipedia page about sequence points.

I suggest reading https://stackoverflow.com/a/21671069/258418. If you put together that = is not a sequence point and that the compiler is free to interleave operations as long as they are not separated by a sequence point (from the answers you linked), you can see that, for example, the following two sequences would both be legal:
load i to reg
increment i
assign reg to i
=> i has previous value of i
load i to reg
assign reg to i
increment i
=> i has value of previous value of i + 1
In general: avoid assigning (this includes modifying by pre/post ++/--) to the same variable twice in one expression.

Related

Unsequenced modification warning [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
I am reading a book for C and I got stuck into this example.
The author says that the result for this example will be x == 0 and y == 101.
I am fine with the y result; however, I really thought that the expression would first evaluate y == y and only then increment y by 1.
I compiled the code and got a warning, "unsequenced modification", and 1 was stored in x.
What's the reason for this?
#include <stdio.h>

int main(void)
{
    int x, y = 100;
    x = y;
    x = y == y++;   /* the line in question */
    printf("%d %d", x, y);
    return 0;
}
An expression in C, including a subexpression of another expression, may have two effects:
A main effect: It produces a value to be further used in the containing expression.
A side effect: It modifies an object or file or accesses a volatile object.
In general, the C standard does not say when the side effect occurs. It does not necessarily occur at the same time that the main effect is evaluated.
In x = y == y++;, y++ has the side effect of modifying y. However, y is also used as the left operand of ==. Because the C standard does not say when the side effect will occur, it may occur before, during, or after using the value of y for the left operand.
A rule in the C standard says that if using the value of an object is not sequenced (specified to occur before or after) relative to a modification of that object, the behavior is undefined. Also, if two modifications are unsequenced relative to each other, the behavior is undefined.
An original motivation for this rule is that incrementing y for y++ could require multiple steps. In a computer with only 16-bit arithmetic, a C implementation might support a 32-bit int by using multiple instructions to get the low 16 bits of y, add 1 to them, remember the carry, store the resulting 16 bits, get the high 16 bits of y, add the carry, and store the resulting bits. If some other code is separately trying to get the value of y, it might get the low 16 bits after the 1 has been added but get the high 16 bits before the carry has been added, and the result could be a value of y that is neither the value before the add (e.g., 0x1ffff) nor the value after the add (e.g., 0x20000) but a mix (0x10000). You could say an implementation ought to track operations on objects and keep them separate so this interleaving does not occur. However, that can impose a burden on compilers and interfere with optimization.
In the expression y == y++ the order in which the expressions y and y++ are evaluated is unsequenced or, in simpler terms, the order is not specified.
However, here the result depends on the order in which the expressions are evaluated. Therefore the compiler emits a diagnostic.

Post-Decrement operator [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
Somebody please tell me what the value of x will be after this (in C):
x=1;
x=x--&&++x;
I think it should be 0, because x&&++x will give 1 and the post-decrement will make it 0.
But when I ran this on a computer, the result was 1.
Why is the post-decrement not working here?
I am thinking like this:
The precedence of pre-increment is above &&, so both x should be treated as 2 (Boolean value true), so x&&++x will give 1 and the post-decrement should decrement it to 0.
This is not a duplicate question, as this is not a case of undefined behavior; it's about how post-decrement works.
x=x--&&++x;
This causes undefined behaviour, as the value of x is changed more than once between two sequence points.
The expression x-- && ++x is well defined, as it has an internal sequence point due to &&, but when you assign it to x, it causes undefined behaviour.
Therefore, the expression exhibits undefined behaviour.
C99 §6.5: “2. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.”
While the result of (x--)&&(++x); is well defined, because && evaluates its left operand completely before the right one (and short-circuits if it is false), the result of your assignment
x = (x--)&&(++x);
is not.
A simpler example would be:
x = x--;
which, as Paul and Art note in the comments:
modifies the value of x twice within the same expression without a
sequence point between the modifications.
EDIT: fixed my initial erroneous answer, which stated that the result of the assignment is defined.
In spite of the fact that there are exams at my university, I wasn't able to stop thinking about this question, and I think I have finally found the solution.
Firstly, the post-decrement operator is executed, as it is on the left of &&. It decrements the value of x to 0, but x-- evaluates to 1 (the previous value of x). As the left side is 1, the right side is executed: here ++x assigns the value 1 to x, and the value of ++x is also 1, so the && operator returns 1. Now, although there is no sequence point between the pre-increment operator and the assignment operator, both assign the value 1 to x, so it is totally defined that the value of x will be 1 after the whole code is executed.

issue with assignment operator inside printf()

Here is the code
#include <stdio.h>

int main(void)
{
    int x = 15;
    printf("%d %d %d %d", x = 1, x < 20, x * 1, x > 10);
    return 0;
}
And output is 1 1 1 1
I was expecting 1 1 15 1 as output,
x*1 should equal 15, but here x*1 is 1. Why?
Does using the assignment operator or modifying a value inside printf() result in undefined behaviour?
Your code produces undefined behavior. Function argument evaluations are not sequenced relative to each other. Which means that modifying access to x in x=1 is not sequenced with relation to other accesses, like the one in x*1. The behavior is undefined.
Once again, it is undefined not because you "used assignment operator or modifying value inside printf()", but because you made a modifying access to variable that was not sequenced with relation to other accesses to the same variable. This code
(x = 1) + x * 1
also has undefined behavior for the very same reason, even though there's no printf in it. Meanwhile, this code
int x, y;
printf("%d %d", x = 1, y = 5);
is perfectly fine, even though it "uses assignment operator or modifying value inside printf()".
Within a function call, the function arguments may be evaluated in any order.
Since one of the arguments modifies x and the others read it, the behaviour is undefined.
The Standard states:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
It doesn't impose an order of evaluation on sub-expressions unless there's a sequence point between them, and rather than requiring some unspecified order of evaluation, it says that modifying an object twice produces undefined behaviour.

Does x = i++ in C have a defined behavior and why?

According to the FAQ, i = i++ is undefined behaviour in C because this statement has only one sequence point (the end of the full expression), and i is changed twice in it (by the side effects of i++ and of =).
Question 1: do I understand that correctly? Or do I misunderstand why i = i++ is undefined?
Question 2: is x = i++ a valid expression?
I guess it is valid and the value of x will always be the original value of i. Although there is only one sequence point in this statement, both x and i are modified only once, and i++ has higher precedence, which means it should be valid: i++ will always be evaluated before the assignment, making x equal to the original value of i. Is that correct?
Yes, at least as of C++03. I believe C++11 changes this somewhat, but I can't get my hands on a copy of that standard to check.
Because you modify x once, and i once. There's no multiple writes to a single variable without an intervening sequence point.

Question about C programming

int a, b;
a = 1;
a = a + a++;
a = 1;
b = a + a++;
printf("%d %d", a, b);
output : 3,2
What's the difference between line 3 and 5?
What you are doing is undefined.
You can't change the value of a variable you are about to assign to.
You also can't change the value of a variable with a side effect and also try to use that same variable elsewhere in the same expression (unless there is a sequence point, but in this case there isn't). The order of evaluation of the two operands of + is unspecified.
So if there is a difference between the two lines, it is that the first is undefined for two reasons, and line 5 is only undefined for one reason. But the point is both line 3 and line 5 are undefined and doing either is wrong.
What you're doing on line 3 is undefined. C++ has the concept of "sequence points" (usually delimited by semicolons). If you modify an object more than once per sequence point, it's illegal, as you've done in line 3. As section 6.5 of C99 says:
(2) Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
Line 5 is also undefined because of the second sentence. You read a to get its value, which you then use in another assignment in a++.
a++ is a post-fix operator, it gets the value of a then increments it.
So, for lines 2,3:
a = 1
a = 1 + 1, a is incremented.
a becomes 3 (Note, the order these operations are performed may vary between compilers, and a can easily also become 2)
for lines 4,5:
a = 1
b = 1 + 1, a is incremented.
b becomes 2, a becomes 2. (Due to undefined behaviour, b could also become 3 if a++ is processed before a.)
Note that, other than for understanding how postfix operators work, I really wouldn't recommend using this trick. It's undefined behavior and will get different results when compiled using different compilers.
As such, it is not only a needlessly confusing way to do things, but also an unreliable, worst-practice way of doing it.
EDIT: And as others have pointed out, this is actually undefined behavior.
Line 3 is undefined, line 5 is not.
EDIT:
As Prasoon correctly points out, both are UB.
The simple expression a + a++ is undefined because of the following:
The operator + is not a sequence point, so the side effects of each operand may happen in either order.
a is initially 1.
One of two possible [sensible] scenarios may occur:
1. The first operand, a, is evaluated first.
a) Its value, 1, will be stored in a register, R. No side effects occur.
b) The second operand, a++, is evaluated. It evaluates to 1 also, and is added to the same register R. As a side effect, the stored value of a is set to 2.
c) The result of the addition, currently in R, is written back to a. The final value of a is 2.
2. The second operand, a++, is evaluated first.
a) It is evaluated to 1 and stored in register R. The stored value of a is incremented to 2.
b) The first operand, a, is read. It now contains the value 2, not 1! It is added to R.
c) R contains 3, and this result is written back to a. The result of the addition is now 3, not 2, like in our first case!
In short, you mustn't rely on such code to work at all.

Resources