This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 2 years ago.
I am reading a book on C and I got stuck on this example.
The author says that the result for this example will be x == 0 and y == 101.
I am fine with the y result; however, I really thought that the expression would first evaluate y == y and only then increment y by 1.
I compiled the code and got the warning "unsequenced modification", and 1 was stored in x.
What's the reason for this?
#include <stdio.h>

int main(void)
{
    int x, y = 100;
    x = y;
    x = y == y++;   /* the line in question: y is read and modified without sequencing */
    printf("%d %d", x, y);
    return 0;
}
An expression in C, including a subexpression of another expression, may have two effects:
A main effect: It produces a value to be further used in the containing expression.
A side effect: It modifies an object or file or accesses a volatile object.
In general, the C standard does not say when the side effect occurs. It does not necessarily occur at the same time that the main effect is evaluated.
In x = y == y++;, y++ has the side effect of modifying y. However, y is also used as the left operand of ==. Because the C standard does not say when the side effect will occur, it may occur before, during, or after using the value of y for the left operand.
A rule in the C standard says that if using the value of an object is not sequenced (specified to occur before or after) relative to a modification of that object, the behavior is undefined. Also, if two modifications are unsequenced relative to each other, the behavior is undefined.
An original motivation for this rule is that incrementing y for y++ could require multiple steps. In a computer with only 16-bit arithmetic, a C implementation might support a 32-bit int by using multiple instructions to get the low 16 bits of y, add 1 to them, remember the carry, store the resulting 16 bits, get the high 16 bits of y, add the carry, and store the resulting bits. If some other code is separately trying to get the value of y, it might get the low 16 bits after the 1 has been added but get the high 16 bits before the carry has been added, and the result could be a value of y that is neither the value before the add (e.g., 0x1ffff) nor the value after the add (e.g., 0x20000) but a mix (0x10000). You could say an implementation ought to track operations on objects and keep them separate so this interleaving does not occur. However, that can impose a burden on compilers and interfere with optimization.
In the expression y == y++ the order in which the expressions y and y++ are evaluated is unsequenced or, in simpler terms, the order is not specified.
However, here the result depends on the order in which the expressions are evaluated. Therefore the compiler emits a diagnostic.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I'm having trouble understanding the difference between unspecified and undefined behavior. I think trying to understand some examples would be useful. For instance, x = x++. The problem with this assignment is that:
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
This violates a shall rule, but does not explicitly invoke undefined behavior, but it involves UB according to:
The order of evaluation of the operands is unspecified. If an attempt is made to modify the result of an assignment operator or to access it after the next sequence point, the behavior is undefined.
Assuming none of these rules existed and there are no other rules that "invalidate" x = x++. The value of x would then be unspecified, right?
The doubt arose because it is sometimes argued that things in C are UB by "default" and are valid only if you can justify that the construction is valid.
Edit: As pointed out by P.W, there is a somewhat related, well-received, version of this question for C++: What made i = i++ + 1; legal in C++17?.
I'm having trouble understanding the difference between unspecified and undefined behavior.
Then let's start with the definitions of those terms from the Standard:
undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this
International Standard imposes no requirements
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during
translation or program execution in a documented manner characteristic
of the environment (with or without the issuance of a diagnostic
message), to terminating a translation or execution (with the issuance
of a diagnostic message).
EXAMPLE An example of undefined behavior is the behavior on integer overflow.
(C2011, 3.4.3)
unspecified behavior: use of an unspecified value, or other behavior where this International Standard provides two or more
possibilities and imposes no further requirements on which is chosen
in any instance
EXAMPLE An example of unspecified behavior is the order in which the
arguments to a function are evaluated.
(C2011, 3.4.4)
You remark that
The doubt arose because it is sometimes argued that things in C are
UB by "default" and are valid only if you can justify that the
construction is valid.
It is perhaps over-aggrandizing that to call it an argument, as if there were some doubt about its validity. In truth, it reflects explicit language from the standard:
If a "shall" or "shall not" requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this
International Standard by the words "undefined behavior" or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe "behavior
that is undefined".
(C2011, 4/2; emphasis added)
When you posit
Assuming none of these rules existed and there are no other rules that
"invalidate" x = x++.
, that doesn't necessarily change anything. In particular, removing the explicit rule that the order of evaluation of the operands is unspecified does not make the order specified. I'd be inclined to argue that the order remains unspecified, but the alternative is that the behavior would be undefined. The primary purpose served by explicitly saying it's unspecified is to sidestep that question.
The rule explicitly declaring UB when an object is modified twice between sequence points is a little less clear, but falls in the same boat. One could argue that the standard still did not define behavior for your example case, leaving it undefined. I think that's a bit more of a stretch, but that's exactly why it is useful to have an explicit rule, one way or the other. It would be possible to define behavior for your case -- Java does, for example -- but C chooses not to do so, for a variety of technical and historical reasons.
The value of x would then be unspecified, right?
That's not entirely clear.
Please understand, too, that the various provisions of the standard for the most part do not stand alone. They are designed to work together, as a (mostly) coherent whole. Removing or altering random provisions has considerable risk of producing inconsistencies or gaps, leaving it difficult to reason about the result.
Modern C11/C17 has changed the text, but it has pretty much the same meaning. C17 6.5/2:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined.
There are several slightly different issues here, mixed into one:
Between sequence points, x is written to (side effect) more than once. This is UB as per the above.
Between sequence points, the expression contains at least one side effect and there is a value computation of the same variable not related to which value to be stored. This is also UB as per the above.
In the expression x = x++, the evaluation of the operand x is not sequenced in relation to the operand x++. The evaluation order is unspecified behavior as per C17 6.5.16.
The side effect of updating the stored value of the left operand is
sequenced after the value computations of the left and right operands. The evaluations of
the operands are unsequenced.
If not for the first cited part labelling this UB, then we still wouldn't know if the x++ would be sequenced before or after the evaluation of the left x operand, so it is hard to reason about how this could become "just unspecified behavior".
C++17 actually fixed this part, making it well-defined there, unlike in C or earlier C++ versions. They did so by defining the sequence order (C++17 8.5.18):
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand.
I don't see how there can be any middle-ground here; either the expression is undefined or it is well-defined.
Unspecified behavior is deterministic behavior which we cannot know or assume anything about. But unlike undefined behavior, it won't cause crashes and random program behavior.
A good example is a() + b(). We can't know which function will be executed first - the program doesn't even have to be consistent if the same line appears later on in the same program. But we can know that both functions will be executed, one before the other.
Compare that with x = a() + b() + x++;, which is undefined behavior, so we can't assume anything about it. One, both or none of the functions might be executed, in any order. The program might crash, produce incorrect results, produce seemingly correct results or do nothing at all.
There have been instances in other programming languages when a previously undefined behavior has become defined in a later standard. One instance I can remember is in C++ where what was undefined behavior in C++11 became well defined in C++17.
i = i++ + 1; // the behavior is undefined in C++11
i = i++ + 1; // the behavior is well-defined in C++17. The value of i is incremented
There has been a well received question on this topic.
What made this well defined is a guarantee in the C++17 standard that
The right operand is sequenced before the left operand.
So in a sense, it is up to the standards committee to change the standard and provide strong guarantees to make it well defined.
But I do not think that something as simple as x = x++; will be made unspecified. It will either be undefined or well-defined.
The problem seems that it cannot be properly defined what i= i++; would mean:
Interpretation 1:
int i1= i;
int i2= i1+1;
i = i2;
i = i1;
In this interpretation the value of i is retrieved and 1 is added (i2), then this i2 is saved to i but the original i in i1 is further used in the assignment (because here the ++ is interpreted to apply to the value after it has been used) and so i is unchanged.
Interpretation 2:
int i1= i;
i1= i1+1;
i= i1;
int i2= i;
i= i2;
In this interpretation the i++ is performed first (and modifies i) and now the modified i is retrieved again and used in the assignment (so i has the incremented value).
Interpretation 3:
int i1= i;
i = i1;
int i2= i1+1;
i= i2;
In this interpretation first the assignment of i to i is executed and then i is incremented.
To me, all these three interpretations are correct, and there could even be a few more interpretations, but they each do something different. Hence the standard could/did not define it and which interpretation a compiler uses is up to the compiler builder and as a result which behavior a compiler exhibits is undefined: undefined behavior.
(A compiler could even generate a jmp toTheMoon instruction or ignore the whole statement.)
The order of evaluation and application of the side effect of ++ is left unspecified - the language standard does not mandate left-to-right or right-to-left order (for arithmetic operators, anyway). Consider the well-defined expression a = b++ * ++c. The expressions a, b++, and ++c may be evaluated in any order. Similarly, the side effects to b and c may be applied immediately after evaluation, or deferred until just before the next sequence point, or anywhere in between. All that matters is that the result of b * (c+1) is computed before being assigned to a. The following is one perfectly legal evaluation:
tmp <- c + 1
a <- b * tmp
c <- c + 1
b <- b + 1
So is this:
c <- c + 1
a <- b * c
b <- b + 1
So is this:
tmp1 <- b
b <- b + 1
tmp2 <- c + 1
a <- tmp1 * tmp2
c <- c + 1
What matters is that, no matter what order of evaluation is chosen, you will always get the same result.
x = x++ could be evaluated in either of the following ways, depending on when the side effect is applied:
Option 1        Option 2
--------        --------
tmp <- x        tmp <- x
x <- x + 1      x <- tmp
x <- tmp        x <- x + 1
The problem is that the two methods give different results. Other, completely different methods may be available based on the instruction set that give different results than these two.
The language standard doesn't mandate what to do when an expression gives different results depending on the order in which it is evaluated - it doesn't place any requirements on the compiler or the runtime environment to pick either option. This is what undefined means - literally, the behavior is not defined by the language specification. You will get a result, but it's not guaranteed to be consistent, or the result you would expect.
Undefined does not mean illegal. Nor does it mean your code is guaranteed to crash. It just means that the result is not predictable or guaranteed to be consistent. An implementation doesn't even have to issue a diagnostic saying "hey, dummy, this is a bad idea."
An implementation is free to define and document a behavior left undefined by the standard (such as MSVC defining fflush on input streams). A number of compilers take advantage of certain behaviors being undefined to perform some optimizations. And some compilers do issue warnings for common mistakes like x = x++.
Why doesn't the value of x increment in the following code?
#include <stdio.h>
int main(){
int x = 3, i = 0;
do {
x = x++;
i++;
} while (i != 3);
printf("%d\n", x);
}
In x = x++, you are saying both to increment x and to assign x a value. The C standard does not define what happens if you do both of these things “at the same time.” For this purpose, “at the same time” means your code has not arranged for one of them to occur before the other.
If you put the increment and the assignment in separate statements, the C standard says that one of them occurs before the other, and then the behavior is fully defined. Technically, there is a sequence point between two such statements. Sequence points are barriers that separate effects. Inside the single statement x = x++, there is no sequence point.
(There is more about sequence points and sequencing in C, but the details are beyond the scope of this question.)
In the simplest C implementation, the compiler might treat x = x++; as if it were x = x; x++; or as if it were int Temporary = x; x++; x = Temporary;. The first would set x to 3 and then to 4. The second would set x to 4 and then to 3. However, the C standard gives implementations a great deal of latitude. In some C implementations, integer types might be made up of parts: a small computer might not be able to handle 32-bit integers all at once, so it might have to do arithmetic in multiple 16-bit steps, or even multiple 8-bit steps. The C standard says, since you have not arranged for the assignment and the increment to occur in a particular order, then, not only is the implementation allowed to do them in either order, it is even allowed to mix the steps. It might do one byte of the assignment, one byte of the increment, the second byte of the assignment, the second byte of the increment, the third byte of the increment, the third byte of the assignment, and so on. In general, you could get a nonsensical answer that is a mishmash of operations on the parts.
So the C standard does not say that, if you do not arrange for the operations to be ordered, then either one happens before the other. It says, if you do not arrange for the operations to be ordered, we do not guarantee what will happen at all. You may get a big mess.
Can someone tell me what is happening behind the scenes here?
#include <stdio.h>

int main(void)
{
    int z, x = 5, y = -10, a = 4, b = 2;
    z = ++x - --y * b / a;
    printf("%d", z);
    return 0;
}
Strange way to write code man...anyway...I try...
Resolves ++x and --y
Resolves multiplications and divisions
Resolves the remaining (plus and minus)
So...
z= 6 - (-11) * 2 /4
z= 6 - (-22) / 4
z= 6 - (-5) (the result is truncated due to (-22) / 4 being an integer division)
I get z= 11.
The variable z is declared int so it becomes 11.
I suggest to write this line in a simpler way!! Oh...sorry for my english...
This expression
z=++x - --y*b/a;
is evaluated in the following order in an abstract machine
Variable x is incremented and becomes equal to 6.
Variable y is decremented and becomes equal to -11.
Variable y is multiplied by variable b and the result is equal to -22.
The result of the preceding operation is divided by variable a, and since integer arithmetic is used, the result is equal to -5.
Finally, that result is subtracted from variable x, and the result is equal to 11.
Run the program to check whether I am correct.
Take into account that a particular implementation may evaluate the operands in a different order provided that the result will be the same as I described for the abstract machine.
According to the C Standard (5.1.2.3 Program execution)
4 In the abstract machine, all expressions are evaluated as specified
by the semantics. An actual implementation need not evaluate part of
an expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
First of all, please note the difference between operator precedence and order of evaluation of sub expressions.
Operator precedence dictates which operations have to be done and evaluated before their results are used together with the rest of the expression. This works similarly to mathematical precedence: 1 + 1 * 2 is guaranteed to give the result 3, not 4, because * has higher precedence than +.
Order of evaluation equals the actual order of execution, and is unspecified behavior, meaning that a compiler is free to execute the various sub expressions in any order it likes, in order to produce the fastest possible code. And we can't know the order. Most operators in C involve unspecified order of evaluation (except some special cases like && || ?: ,).
For example, in the case of x = y() + z(), we know that the + operation will be executed before =, but we can't tell which of the functions y and z will be executed first. It may or may not matter to the result, depending on what the functions do.
Then to the expression in the question:
Operator precedence dictates that the two operations ++x and --y must be evaluated before the other operations, since the prefix unary operators have highest precedence of those present in the expression.
Which of the sub expressions ++x and --y*b/a is evaluated first is not specified. We can't tell the order of execution (and --y*b/a in turn contains several sub expressions). At any rate, the order of evaluation does not matter here; it will not affect the result.
The increments/decrements ++x and --y will take place before the results of those operations are used together with the rest of the expression.
Operator precedence then dictates that the operations involving * and / must be evaluated next. These operators have the same precedence, but they belong to the multiplicative operators group, which has left-to-right associativity, meaning that --y*b/a is guaranteed to evaluate --y*b first. After that, the result will get divided by a.
So the whole right-most sub expression is equivalent to ( (--y) * b ) / a.
Next, operator precedence dictates that - has higher precedence than =. So the result of the sub expression --y*b/a is subtracted from the result of the sub expression ++x.
And finally the result is assigned to z, since = had the lowest precedence.
EDIT
Btw, the proper way to write the same, and get the very same machine code, is this:
++x;
--y;
z = x - (y*b)/a;
Apart from giving reduced readability, the ++ and -- operators are dangerous to mix with other operators since they contain a side effect. Having more than one side effect per expression could easily lead to various forms of unsequenced processing, which is always a bug, possibly severe. See this for examples.
"Operator precedence" means the rules for deciding what the operands are of each operator. In your case, using parentheses to indicate:
z=++x - --y*b/a;
is equivalent to:
z = ((++x) - (((--y) * b) / a));
Now, this line of code and the following printf statement have the same observable behaviour as the code:
z = (x + 1) - ((y - 1) * b / a);
printf("%d\n", z);
x = x + 1;
y = y - 1;
C is defined in terms of observable behaviour (which approximately means the output generated by the program; you can see a technical definition by reading the C standard). Any two programs which would produce the same observable behaviour according to the standard are considered to be exactly equivalent.
This is sometimes called the "as-if rule" and it is this rule that allows optimization to occur.
Addressing points raised by some of the other answers:
There are rules surrounding exactly what ++ and -- do. Specifically, the effects of incrementing x and decrementing y are defined so that the writing back of the increment and decrement could happen at any time during the execution of z=++x - --y*b/a; . They could be in either order, or simultaneous; the writing of the decrement could be either before or after the computation of (y-1) * b, and so on.
In some different code examples, we would use these rules to work out the observable behaviour of the program, and it would not be quite so flexible as this particular program.
But in this code example, since nothing else depends on the timing of those increments and decrements, it turns out that we can even hoist them past the printf, according to the "as-if rule".
I have another C pointers question.
Consider executing the following program:
int x[5] = {0,3,5,7,9};
int* y = &x[2];
*(y+2) = *(y--);
What values does the array x hold afterwards?
What the hell is going on with y--? I know how *(y+2) works, and understand the rest, but not how y-- ties in with the rest.
Also, the answer given is {0, 3, 5, 5, 9}.
There's no sequence point between y-- and y + 2 in *(y+2) = *(y--);, so whether y + 2 refers to &x[4] or &x[3] is unspecified. Depending on how your compiler does things, you can either get 0 3 5 5 9 or 0 3 5 7 5.
What it means that there is no sequence point between the two expressions is, in a nutshell, that it is not specified whether the side effects of one operation (y-- in this case) have been applied by the time the other (y + 2) is evaluated. You can read more about sequence points here.
ISO/IEC 9899:201x
6.5 Expressions
p2: If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any of the orderings.
You should not trust the answers given by your professor in this case.
Expanding on Wintermute's answer a bit...
The problem is with the statement
*(y+2) = *(y--);
The expression y-- evaluates to the current value of y, and as a side effect decrements the variable. For example:
int a = 10;
int b;
b = a--;
After the above expression has been evaluated, b will have the value 10 and a will have the value 9.
However, the C language does not require that the side effect be applied immediately after the expression has been evaluated, only that it be applied before the next sequence point (which in this case is at the end of the statement). Neither does it require that expressions be evaluated from left to right (with a few exceptions). Thus, it's not guaranteed that the value of y in y+2 represents the value of y before or after the decrement operation.
The C language standard explicitly calls operations like this out as undefined behavior, meaning that the compiler is free to handle the situation in any way it wants to. The result will vary based on the compiler, compiler settings, and even the surrounding code, and any answer will be equally correct as far as the language definition is concerned.
In order to make this well-defined and give the same result on every implementation, you would need to decrement y before the assignment statement:
y--;
*(y+2) = *y;
This is consistently one of the most misunderstood and mis-taught aspects of the C language. If your professor is expecting this particular result to be well-defined, then he doesn't know the language as well as he thinks he does. Then again, he's not unique in that respect.
Repeating and expanding on the snippet from the C 2011 draft standard that Wintermute posted:
6.5 Expressions
...
2 If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any of the orderings.84)
3 The grouping of operators and operands is indicated by the syntax.85) Except as specified
later, side effects and value computations of subexpressions are unsequenced.86)
84) This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
while allowing
i = i + 1;
a[i] = i;
85) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same
as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the
expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in
6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators
(6.5.3), and an operand contained between any of the following pairs of operators: grouping
parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and
the conditional operator ? : (6.5.15).
Within each major subclause, the operators have the same precedence. Left- or right-associativity is
indicated in each subclause by the syntax for the expressions discussed therein.
86) In an expression that is evaluated more than once during the execution of a program, unsequenced and
indeterminately sequenced evaluations of its subexpressions need not be performed consistently in
different evaluations.
Emphasis added. Note that this has been true since the C89 standard, although the wording has changed a bit since then.
"Unsequenced" simply means it's not guaranteed that one operation is completed before the other. The assignment operator does not introduce a sequence point, so it's not guaranteed that the LHS of the expression is evaluated before the RHS.
Now for the hard bit - your professor obviously expects a specific behavior for these kinds of expressions. If he gives a test or a quiz that asks what the result of something like a[i] = i--; will be, he's probably not going to accept an answer of "the behavior is undefined", at least not on its own. You might want to discuss the answers Wintermute and I have given with him, along with the sections of the standard quoted above.
The problem is in this statement.
*(y+2) = *(y--);
Because in C, reading a variable in an expression that also modifies it, with nothing sequencing the read relative to the modification, has undefined behavior.
Another example is:
i = 5;
v[i] = i++;
In this case the most likely outcome (AFAIK) is that the compiler evaluates either the LHS or the RHS first. If the LHS is evaluated first, we get v[5] = 5; and i equals 6 after the assignment. If the RHS is evaluated first, its value is 5, but i is already 6 by the time the left side is evaluated, so we end up with v[6] = 5;. However, given the saying that "undefined behavior allows the compiler to do anything it chooses, even to make demons fly out of your nose", you should not count on either of those outcomes. Anything can happen, because what happens depends on the compiler.
First of all int x[5] = {0, 3, 5, 7, 9} means
x[0] = 0, x[1] = 3, x[2] = 5, x[3] = 7, x[4] = 9
Next, int *y = &x[2] makes the pointer y point to the address of x[2].
Now here comes your confusion: *(y + 2) refers to the element at the address of x[4],
and in *(y--), y-- is a post-decrement operator, so the value at *y is used first, which is x[2] = 5, and the value assigned is therefore x[4] = 5.
The final output would be 0 3 5 7 5
As everyone knows, this loops through zero:
while (x-- > 0) { /* also known as x --> 0 */
printf("x = %d\n", x);
}
But x = x-- yields undefined behaviour.
Both examples need some 'return' value of x--, which is not there I guess. How can it be that x-- > 0 is defined but x = x-- is not?
Because in x = x-- you're modifying the value of x twice without an intervening sequence point. So the order of operations is not defined. In x-- > 0 the value of x is modified once, and it is clearly defined that result of evaluating x-- will be the value of x before the decrement.
I don't know where you got that idea about "need some 'return' value of x--, which is not there". Firstly, it is not exactly clear what you mean. Secondly, regardless of what you mean this doesn't seem to have anything to do with the source of undefined behavior in x = x--.
x = x-- produces undefined behavior because it attempts to modify x twice without an intervening sequence point. No "need" for any "return value" is involved here.
The underlying problem with x = x-- is that it has two side-effects that occur at undefined moments in undefined order. One side-effect is introduced by the assignment operator. Another side-effect is introduced by postfix -- operator. Both side-effects attempt to modify the same variable x and generally contradict each other. This is why the behavior in such cases is declared undefined de jure.
For example, if the original value of x was 5, then your expression requires x to become both 4 (side-effect of decrement) and 5 (side-effect of assignment) at the same time. Needless to say, it is impossible for x to become 4 and 5 at the same time.
Such a straightforward contradiction (like 4 vs 5) is not required for UB to occur, though. Every time you have two side-effects hitting the same variable without an intervening sequence point, the behavior is undefined, even if the values these side-effects are trying to put into the variable match.
In order to understand this you need to have a basic understanding of sequence points. See this link: http://en.wikipedia.org/wiki/Sequence_point
For the = operator there is no sequence point, so there is no guarantee that the value of x will be modified before it is again assigned to x.
When you are checking the condition in the while loop x-- > 0, x-- is evaluated and the value is used in the relational operator evaluation so there is no chance of undefined behaviour because x is getting modified only once.
Just to add something to other answers, try reading this wikipedia page about sequence points.
I suggest reading https://stackoverflow.com/a/21671069/258418. If you combine the fact that = is not a sequence point with the fact, from the answers you linked, that the compiler is free to interleave operations as long as they are not separated by a sequence point, you see that, for example, the following two sequences would both be legal:
load i to reg
increment i
assign reg to i
=> i has previous value of i
load i to reg
assign reg to i
increment i
=> i has value of previous value of i + 1
In general: avoid assigning (this includes modifying by pre/post ++/--) to the same variable twice in one expression.