Using postfix increment in an L-value [duplicate] - c

Consider the following snippet (in C):
uint8_t index = 10;
uint8_t arr[20];
arr[index++] = index;
When I compile this with gcc, it sets arr[10] to 10, which means that the postfix increment isn't being applied until after the entire assignment expression. I found this somewhat surprising, as I was expecting the increment to return the original value (10) and then increment index to 11, thereby setting arr[10] to 11.
I've seen lots of other posts about increment operators in RValues, but not in LValue expressions.
Thanks.

Some standard language:
6.5 Expressions
1 An expression is a sequence of operators and operands that specifies computation of a
value, or that designates an object or a function, or that generates side effects, or that
performs a combination thereof.
2 Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
3 The grouping of operators and operands is indicated by the syntax. Except as specified later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
Paragraph 2 explicitly renders expressions of the form a[i++] = i undefined; the prior value of i isn't just being read to determine the result of i++. Thus, any result is allowed.
Beyond that, you cannot rely on the side effect of the ++ operator to be applied immediately after the expression is evaluated. For an expression like
a[i++] = j++ * ++k
the only guarantee is that the result of the expression j++ * ++k is assigned to the result of the expression a[i++]; however, each of the subexpressions a[i++], j++, and ++k may be evaluated in any order, and the side effects (assigning to a[i], updating i, updating j, and updating k) may be applied in any order.

The line
arr[index++] = index;
causes undefined behaviour. Read the C standard for more details.
The essence you should know is: You should never read and change a variable in the same statement.

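The assignment operator associates from right to left, so a = b = c groups as a = (b = c). That only fixes the grouping: the two operands of = may be evaluated in either order, and the only guarantee is that the store to the left operand happens after both operands have been evaluated.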


Are a[i]=y++; and a[i++]=y; undefined behavior or unspecified in C language?

While trying to understand why the expression v[i++]=i; has undefined behavior, I came across this explanation: the expression lies between two sequence points, and the C standard says that the order in which side effects occur between two sequence points is unspecified, so it is not certain whether the ++ operator or the = operator is applied first. This puzzles me. When the expression is evaluated, shouldn't precedence be used to judge first, and then the sequence point should be introduced to judge which sub-expression is executed first? Am I missing something?
Given that explanation (from user AnT stands with Russia), does it mean that for code such as a[i]=y++; or a[i++]=y; we also cannot be sure whether the ++ operator or the = operator runs first?
The reason v[i++]=i; is undefined behavior is because the variable i is both read and written in the same expression without sequencing.
Expressions such as a[i]=y++ and a[i++]=y do not exhibit undefined behavior because no variable is both read and written in the expression without sequencing.
The = operator does however ensure that both of its operands are fully evaluated before the side effect of assigning to the left side. Specifically, a[i] is evaluated to be an lvalue designating the ith element of the array a, and y++ is evaluated to be the current value of y.
The specific rule in the C standard is C 2018 6.5 2:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
The first sentence is the critical one here. First, consider v[i] = i++;. Here, the i in v[i] computes the value of i, and the i++ both computes the value of i and increments the stored value of i. Computing the value of i is a value computation of i. Incrementing the stored value of i is a side effect. To determine whether the behavior of v[i] = i++; is undefined, we ask whether the side effect is unsequenced relative to any other side effect on i or to a value computation on i.
There is no other side effect on i, so it is not unsequenced relative to any other side effect.
There is a value computation in i++, but the side effect and this value computation are sequenced by the specification of the postfix ++ operator. C 2018 6.5.2.4 2 says:
… The value computation of the result is sequenced before the side effect of updating the stored value of the operand…
So we know the computation of the value of i in i++ is sequenced before the side effect of incrementing the stored value.
Now we consider the value computation of the i in v[i]. The ++ specification does not tell us about this, so let’s consider the assignment operator, =. The specification of assignment does say something about sequencing, in C 2018 6.5.16 3:
… The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.
The first sentence tells us the update of v[i] is sequenced after the value computations of the left and right operands. But it does not tell us anything about the side effect in ++ relative to the value computation of i in v[i].
Therefore, the value computation of i in v[i] is unsequenced relative to the side effect on i in i++, so the behavior of the statement is not defined by the C standard.
In a[i] = y++; we have:
A value computation on i in a[i].
A value computation on y in y++.
An update of the stored value of y in y++.
A value computation on a in a[i].
An update of the stored value of a[i] in a[i] = ….
The only object that is updated twice or that is both updated and evaluated is y, and we know from above that the value computation on y in y++ is sequenced before the update of y. So this statement does not contain any side effect that is unsequenced relative to another side effect or value computation on the same object. So its behavior is not undefined by the rule in C 2018 6.5 2.
Similarly, in a[i++] = y;, we have:
A value computation on i in a[i++].
An update of the stored value of i in i++.
A value computation on y.
A value computation on a in a[i++].
An update of the stored value of a[i] in a[i++] = ….
Again, there is only one object with two operations on it, and those operations are sequenced. The behavior is not undefined by the rule in C 2018 6.5 2.
Note
In the above, we assume neither a nor v is a pointer such that a[i] or v[i] would be i or y. If instead we consider this code:
int y = 3;
int *a = &y;
int i = 0;
a[i] = y++;
Then the behavior is undefined because a[i] is y, so the code updates y twice, once for the assignment a[i] = … and once for y++, and these updates are unsequenced. The specification of assignment says the update to the left operand is sequenced after the value computation of the result (which is the value of the right side of the assignment), but the increment for ++ is a side effect, not part of the value computation. So the two updates are unsequenced, and the behavior is not defined by the C standard.
An attempt to explain the "standardese" terms plainly:
The standard says (C17 6.5) that in an expression, a side effect of a variable may not occur in an unsequenced order in relation to a value computation of that same object.
To make sense of these strange terms:
Side effect = writing to a variable, or performing a read or write access to a volatile variable.
Value computation = reading the value from memory.
Unsequenced = The order between accesses/evaluations is neither specified nor well-defined. C has the concept of sequence points, which are certain points in the program that when reached, previous side effects must have been evaluated. For example, a ; introduces a sequence point. Two parts of an expression are unsequenced in relation to each other when the order of evaluation of each part is not well-defined before the next sequence point. (A complete list of all sequence points can be found in C17 Annex C.)
So when translated from standardese to English, v[i++]=i; has undefined behavior since i is written to in an unspecified order related to the other read of i in the same expression. How do we know that?
The assignment operator = says that (6.5.16) "the evaluations of the operands are unsequenced", referring to the left and right operands of =.
The postfix ++ operator says that (6.5.2.4) "As a side effect, the value of the operand object is incremented" and "The value computation of the result is sequenced before the side effect of updating the stored value of the operand". In practice this means that i is first read and the ++ is applied later, though before the next sequence point, in this case the ;.
In case of a[i]=y++; or a[i++]=y; everything happens on different variables. There are two side effects, updating i (or y) and updating a[i] but they are done on different objects, so both examples are well-defined.
The C standard (C11 draft) says the following about the postfix ++ operator:
(6.5.2.4.2) The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it). [...]
A sequence point is defined by a point in the code where it is guaranteed that all side effects before the point have taken effect and no side effects after the point have taken effect.
There are no intermediate sequence points in the expression v[i++] = i;. Thus it is not defined whether the side effect of the expression i++ (incrementing i) takes effect before or after the right-hand side i is evaluated. Because that read of i is unsequenced relative to the increment, the behavior of the whole expression is undefined, not merely the value of the right-hand side i.
This problem does not exist in the expression a[i++] = y; because the value of the right-hand side y is not affected by the side effect of i++.
When the expression is evaluated ...
Which expression?
v[i++]=i;
is a statement. It consists of a toplevel assignment expression a = b, where a and b are both themselves expressions.
The left-hand expression a is itself of the form c[d], where d is another subexpression of the form e++, and e is yet another expression, finally resolved to i.
If it helps we can write the whole thing out in pseudo-function-call style, like
assign(array_index(v, increment_and_return_old_value(i)), i);
Now, the problem is that the standard doesn't tell us whether the final value parameter i is obtained before or after i is mutated by increment_and_return_old_value(i) (or by i++).
... and then the sequence point should be introduced to judge which sub-expression is executed first?
The , in a function call parameter list isn't a sequence point. The relative order in which function parameters are evaluated is not defined (only that they must all have been evaluated before the function body is entered).
The same logic applies to the original code - the standard says there is no sequence point, so there is no sequence point.
does it mean that for code such as a[i]=y++; or a[i++]=y; we also cannot be sure whether the ++ operator or the = operator runs first.
It's not the assignment that is the problem, it is evaluating the right-hand operand to be assigned.
And, in these cases, there is no relationship between left-hand side thing being assigned to and the right-hand side value being assigned. So although we still cannot be sure which is evaluated first, it doesn't matter.
If I wrote out explicitly
int *lhs = &a[i];
int rhs = y++;
*lhs = rhs;
then reversing the first two lines would make no difference. Their relative order doesn't matter, so the lack of a defined relative order doesn't matter.
Conversely, for completeness,
int *lhs = &v[i++];
int rhs = i;
*lhs = rhs;
is the original case where the order of the first two lines does matter, and the fact that it is unspecified is a problem.

Undefined behaviour seems to be contradicting with operator precedence rule in C [duplicate]

Consider the following line of code.
int i = 2;
i = i++;
The second line of code has been identified as undefined. I know this question has been asked several times before, one example being this.
But nowhere could I see the issue of operator precedence being addressed in this issue. It has been clearly mentioned that postfix operator precedes assignment operator.
i = (i++)
So clearly i++ will be evaluated first, and this value of i is then assigned to i again.
This looks like this particular undefined behavior is contradicting the precedence rule.
Similar to this is the code:
int i = 2;
i++ * i++;
Here according to operator precedence the code can be written as
int i = 2;
(i++) * (i++);
Now we do not know whether the (i++) in LHS or RHS of '*' operator is going to be evaluated first. But either way it is going to produce the same result. So how is it undefined?
If we write say:
int p;
p = f1() + f2();
where f1() and f2() are defined functions, then obviously we can't decide whether f1() or f2() is going to be evaluated first, as the precedence rules do not specify this. But a confusion like this does not seem to arise in the current problem.
Please explain.
I do not understand why this question got a negative vote. I needed a clarity between operator precedence and UB and I have seen no other question addressing it.
What you're looking for is in section 6.5 on Expressions, paragraph 3 of the C standard:
The grouping of operators and operands is indicated by the syntax.
Except as specified later, side effects and value computations of
subexpressions are unsequenced.
This means that the side effect of incrementing (or decrementing) via the ++ or -- operators doesn't necessarily happen immediately when the operator is encountered. The only guarantee is that it happens before the next sequence point.
In the case of i = i++;, there is no sequence point in the evaluation of the operands of = nor in the evaluation of postfix ++. So an implementation is free to perform assigning the current value of i to itself and the side effect of incrementing of i in any order. So i could potentially be either 2 or 3 in your example.
This goes back to paragraph 2:
If a side effect on a scalar object is unsequenced relative
to either a different side effect on the same scalar object
or a value computation using the value of the same scalar
object, the behavior is undefined.
Since i = i++ attempts to update i more than once without a sequence point, it invokes undefined behavior. The result could be 2 or 3, or something else might happen as a result of optimizations for example.
The reason that this is not undefined:
int p;
p = f1() + f2();
is because no variable is being updated more than once between sequence points. The result could, however, be unspecified if both f1 and f2 access the same global variables, since the evaluation order is unspecified.
The problem with using
i = i++
is that the order in which the address of i is accessed to read and write is not specified. As a consequence, at the end of that line, the value of i could be 3 or 2.
When will it be 3?
Evaluate the RHS - 2
Assign it to the LHS. i is now 2.
Increment i. i is now 3.
When will it be 2?
Evaluate the RHS - 2
Increment i. i is now 3.
Assign the result of evaluating the RHS to the LHS. i is now 2.
Of course, if there is a race condition, we don't know what's going to happen.
But nowhere could I see the issue of operator precedence being addressed in this issue.
Operator precedence only affects how expressions are parsed (which operands are grouped with which operators) - it has no effect on how expressions are evaluated. Operator precedence says that a * b + c should be parsed as (a * b) + c, but it doesn't say that either a or b must be evaluated before c.
Now we do not know whether the (i++) in LHS or RHS of '*' operator is going to be evaluated first. But either way it is going produce the same result. So how is it undefined?
Because the side effect of the ++ operator does not have to be applied immediately after evaluation. Side effects may be deferred until the next sequence point, or they may be applied before other operations, or sprinkled throughout. So if i is 2, then i++ * i++ may be evaluated as 2 * 2, or 2 * 3, or 3 * 2, or 2 * 4, or 4 * 4, or something else entirely.
OPERATOR PRECEDENCE DOES NOT RESOLVE UNDEFINED EXPRESSION ISSUES.
Sorry for shouting, but people ask about this all the time. I'm afraid I have to say your research on this question must not have been very thorough, if you didn't see this aspect being discussed.
See for example this essay.
The expression i = i++ tries to assign to object i twice. It's therefore undefined. Period. Precedence doesn't save you.

Using of several increment/decrement in the same statement

I know that the order of evaluation in C is not strict, so the value of the expression --a + ++a is undefined because it is unknown which part of the statement runs first.
But what if I know that the order of evaluation is irrelevant in a particular case? For example:
All modifications correspond to different variables (like in a[p1++] = b[p2++])
The order does not matter, as in a++ + ++a: the result is two no matter which side of + is calculated first. Is it guaranteed that one of the parts will be calculated fully before the other one starts? That is, may the compiler compute the result of a++ and the result of ++a independently (for example, by caching the initial value of a and passing it to both operators) and only then apply the increments, getting one instead of two?
I'm interested in answers about C, C99, C11, C++03 and C++11, if there is any difference between all of them.
The standard says:
Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be accessed only to
determine the value to be stored.
Except as indicated by the syntax or otherwise specified later
(for the function-call operator () , && , || , ?: , and comma
operators), the order of evaluation of subexpressions and the order in
which side effects take place are both unspecified.
So:
1.) a[p1++] = b[p2++]: It is guaranteed that the statement is evaluated correctly and gives the expected result. This is because each variable is modified only once and the result does not depend on the time when the actual increment of both variables is done.
2.) a++ + ++a: It is not guaranteed that the side effect (increment) is performed before the second use of a. Hence this expression can give the value a + (a+1) or (a+1) + (a+1) or a + (a+2), depending on when your compiler performs the side-effect increments of the original variable. Since a is modified twice between sequence points, the first paragraph quoted above makes the behavior undefined outright.
Online C 2011 standard:
6.5 Expressions
...
3 The grouping of operators and operands is indicated by the syntax.85) Except as specified
later, side effects and value computations of subexpressions are unsequenced.86)
85) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same
as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the
expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in
6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators
(6.5.3), and an operand contained between any of the following pairs of operators: grouping
parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and
the conditional operator ? : (6.5.15).
Within each major subclause, the operators have the same precedence. Left- or right-associativity is
indicated in each subclause by the syntax for the expressions discussed therein.
86) In an expression that is evaluated more than once during the execution of a program, unsequenced and
indeterminately sequenced evaluations of its subexpressions need not be performed consistently in
different evaluations.
There's no guarantee that the side effect of either a++ or ++a is applied before the other expression is evaluated, so you can get different results depending on the sequence of operations.
Here are several cases, assuming a starts out at 1:
Left to right evaluation, side effects applied immediately: (1) + (2+1) == 4
Left to right evaluation, side effects deferred: (1) + (1+1) == 3
Right to left evaluation, side effects applied immediately: (2) + (1+1) == 4
Right to left evaluation, side effects deferred: (1) + (1+1) == 3
Or any other combination.

C Programming : Confusion between operator precedence

I am confused between precedence of operators and want to know how this statement would be evaluated.
#include <stdio.h>
int main()
{
    int k = 35;
    printf("%d %d %d", k == 35, k = 50, k > 40);
    return 0;
}
Here k initially has the value 35. When testing k in printf, I think:
k>40 should be checked which should result in 0
k==35 should be checked and which should result in 1
Lastly 50 should get assigned to k and which should output 50
So final output should be 1 50 0, but output is 0 50 1.
You cannot rely on the output of this program, since it has undefined behavior. The order of evaluation is not specified in C, which allows the compiler to optimize better. From the C99 draft standard, section 6.5 paragraph 3:
The grouping of operators and operands is indicated by the syntax. Except as specified
later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
It is also undefined because you are reading the value of k and assigning to it without an intervening sequence point. From draft standard section 6.5 paragraph 2:
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
It cites the following code examples as being undefined:
i = ++i + 1;
a[i++] = i;
Update
There was a comment as to whether the commas in the function call act as sequence points. Section 6.5.17 Comma operator, paragraph 2 says:
The left operand of a comma operator is evaluated as a void expression; there is a
sequence point after its evaluation.
but paragraph 3 says:
EXAMPLE As indicated by the syntax, the comma operator (as described in this subclause) cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers).
So in this case the comma does not introduce a sequence point.
The order in which function arguments are evaluated is not specified. They can be evaluated in any order. The compiler decides.
This is undefined behaviour.
You may get any value, because there is no sequence point between the evaluations of the arguments. Increase the warning level (for example, -Wall with gcc) and you will get: warning: operation on ‘k’ may be undefined.

Why doesn’t this code: a[i] = i++; work? [duplicate]

a[i] = i++;
Why does the above code not work?
What is wrong with above code? I am asking this question to improve my knowledge.
Because the ISO standard says that you are not allowed to change a variable more than once (or change and use one) without an intervening sequence point.
There is no sequence point between the use of i in a[i] and the change of i in i++.
The sequence points from C11 (not really changed that much since C99) are listed in Annex C:
Between the evaluations of the function designator and actual arguments in a function call and the actual call.
Between the evaluations of the first and second operands of the following operators: logical AND &&; logical OR ||; comma ,.
Between the evaluations of the first operand of the conditional ?: operator and whichever of the second and third operands is evaluated.
The end of a full declarator: declarators;
Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer; the expression in an expression statement; the controlling expression of a selection statement (if or switch); the controlling expression of a while or do statement; each of the expressions of a for statement; the expression in a return statement.
Immediately before a library function returns.
After the actions associated with each formatted input/output function conversion specifier.
Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call.
and 5.1.2.3 Program execution states:
Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which.
The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.
Section 6.5 Expressions pretty much covers your exact case:
If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.
This paragraph renders undefined statement expressions such as i = ++i + 1; and a[i++] = i; while allowing i = i + 1; and a[i] = i;.
It does compile and run, but possibly not as expected. The problem is that it is not clear whether i gets incremented before the assignment; if it does, a[i] refers to the next item in the array.
Your question was very terse, so you can expand on it if you want more information. In short, it is hard to tell exactly which element of a that statement assigns to.
