In C, the order of evaluation of operands has nothing to do with operator precedence and associativity.
Suppose I have an expression in C: expr1 * expr2 + (expr3 + expr4) (no sequence points in between).
When this expression is evaluated then:
Will sub-expressions expr3 and expr4 be evaluated before expr1 and expr2 because of the parentheses?
Or do the parentheses ensure that operators inside the parentheses are evaluated before the operators outside them?
Do the parentheses ensure order of evaluation of operands or operators?
From the online draft of the 2011 language standard:
6.5 Expressions
...
3 The grouping of operators and operands is indicated by the syntax.85) Except as specified
later, side effects and value computations of subexpressions are unsequenced.86)
85) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in
6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators
(6.5.3), and an operand contained between any of the following pairs of operators: grouping
parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and
the conditional operator ? : (6.5.15).
Within each major subclause, the operators have the same precedence. Left- or right-associativity is
indicated in each subclause by the syntax for the expressions discussed therein.
86) In an expression that is evaluated more than once during the execution of a program, unsequenced and
indeterminately sequenced evaluations of its subexpressions need not be performed consistently in
different evaluations.
Clear as mud, right? What it means is that, given an expression like
x = a++ + b++ * (--c / ++d)
each of the subexpressions a++, b++, --c, and ++d may be evaluated in any order; just because --c and ++d are grouped by parens doesn't mean they're evaluated first. Furthermore, the side effects of each ++ and -- don't have to be applied immediately after the expression is evaluated.
All operator precedence guarantees is that the result of --c / ++d will be multiplied by the result of b++, and the result of a++ will be added to that value; it does not guarantee that any expression is evaluated before any other.
Pay close attention to footnote 86; if the above expression appeared in a loop, there's no reason to expect that the subexpressions would be evaluated in the same order every time through the loop. As a practical matter, they most likely will be, but the compiler is explicitly given the freedom to shake things up.
Because of this freedom to evaluate expressions and apply side effects in any order, certain expressions like a++ + a++ won't give consistent results; the standard explicitly calls this out as undefined behavior, meaning the compiler isn't obligated to handle the situation in any particular way. It can ignore the issue, it can issue a warning, it can halt translation with an error, etc., but there's no requirement that it do any one particular thing.
1. Will sub-expressions expr3 and expr4 be evaluated before expr1 and expr2 because of the parentheses?
No. The evaluation order of expr1, expr2, expr3 and expr4 is unspecified by the C standard. The compiler can evaluate these sub-expressions in any order it prefers, and can interleave them with the operator evaluation if it wants.
2. Or do the parentheses ensure that the operators inside them are evaluated before the operators outside them?
Yes. The parentheses will override the operator precedence from the normal (((expr1 * expr2) + expr3) + expr4) to ((expr1 * expr2) + (expr3 + expr4)). This just determines in which relative order the operators are evaluated.
It can help to show the evaluation order constraints as a tree :
                +
        ________|________
        *               +
    ____|____       ____|____
  expr1   expr2   expr3   expr4
The evaluation of a node in this tree requires that its children have been evaluated already. But that's the only restriction.
The evaluation order could be : expr2, expr3, expr4, +, expr1, *, +.
Or it could be : expr4, expr3, +, expr1, expr2, *, +.
Or any other permutation that fits the restriction mentioned above.
The parentheses guarantee that what's inside them is evaluated before anything that depends on them. No more.
In your example, (expr3+expr4) is evaluated before the + that adds it to expr1*expr2. It doesn't mean that it's evaluated before expr1*expr2.
() has the highest priority, why is it short-circuited?
int a = 1, b = 0;
(--a)&&(b++);
Why is (b++) still short-circuited?
I don't find the term "short-circuit" particularly helpful.
It is better to say that && and || have left-to-right order of evaluation. The C standard (6.5.13) actually explains this well:
the && operator guarantees left-to-right evaluation ...
If the first operand compares equal to 0, the second
operand is not evaluated.
And that's it. The left operand --a evaluates to 0.
What is the difference between operator precedence and order of evaluation?
In this case, all that matters is order of evaluation. Operator precedence only guarantees that the expression is parsed as expected, that is, which operands "glue" to which operator.
The expression (--a) uses the prefix operator and evaluates to 0. This is false and the 2nd expression is not evaluated due to short-circuit as you noted.
To understand consider the following expression
a * b + c * d
If we assume that the operands of the + operator are evaluated from left to right, then a * b is evaluated first and c * d second.
That assumption is not valid for the + operator, whose operands may be evaluated in either order. It is valid for the && operator.
So the left operand of the expression
(--a)&&(b++);
is evaluated first, and the second operand is evaluated only if the first is not equal to 0.
From the C Standard (6.5.13 Logical AND operator)
4 Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; if the second operand is evaluated, there
is a sequence point between the evaluations of the first and second
operands. If the first operand compares equal to 0, the second operand
is not evaluated.
You are right that you can override operator precedence using parentheses ().
For example, the expression
a && b || c
is equivalent to
(a && b) || c,
because && has a higher precedence than ||. You can override this precedence by changing it to
a && (b || c)
so that now the || has precedence over &&.
However, operator precedence doesn't change the behavior of the individual operators. In particular, both operators still short-circuit. So even when the || is grouped more tightly than the &&, the && operator still evaluates its left-hand operand before its right-hand operand.
In both of my examples above, the && operator still evaluates a before evaluating b or (b || c). Operator precedence only affects whether the right-hand operand of the && operator is the expression b or (b || c).
Therefore, in your example, by writing (--a)&&(b++) instead of --a&&b++, you are merely ensuring that the -- and ++ operators have precedence over the && operator. This is not necessary, because the C language specifies that these operators already have precedence.
In the hypothetical scenario that the C language had instead specified that the && operator had precedence over the -- and ++ operators, then the expression --a&&b++ would be interpreted as --(a&&b)++, and it would be necessary to write (--a)&&(b++) to prevent this interpretation. But this is not the case.
I have learnt that the logical operators are guaranteed to evaluate their operands from left to right, but I was wondering what the order of evaluation is for comparison operators. For instance, in expression1 < expression2, is it guaranteed that expression1 will be evaluated before expression2?
According to the standard:
J.1 Unspecified behavior
The following are unspecified:....
— The order in which subexpressions are evaluated and the order in which side effects
take place, except as specified for the function-call (), &&, ||, ?:, and comma
operators (6.5).
Generally speaking, the order of evaluation of subexpressions within an expression is unspecified.
The only operators that impose an order on their operands, through sequence points, are the || (logical OR), && (logical AND), , (comma), and ?: (ternary) operators.
In the case of &&, if the expression on the left evaluates to false (i.e. 0), the result is known to be false and the right side is not evaluated. Similarly for || if the expression on the left evaluates to true (i.e. not 0), the result is known to be true and the right side is not evaluated.
For the ternary operator, the conditional is evaluated first. If it evaluates to true then only the middle part is evaluated, otherwise only the third part is evaluated.
For the comma operator, the left side is evaluated first, then the right side.
From the C standard:
6.5.13.4 Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation; there is a sequence point
after the evaluation of the first operand. If the first
operand compares equal to 0, the second operand is not evaluated.
...
6.5.14.4 Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is a sequence point after the
evaluation of the first operand. If the first operand
compares unequal to 0, the second operand is not evaluated.
...
6.5.15.4 The first operand is evaluated; there is a sequence point after its evaluation. The second operand is evaluated only if the
first compares unequal to 0; the third operand is evaluated only if
the first compares equal to 0; the result is the value of the second
or third operand (whichever is evaluated), converted to the type
described below. If an attempt is made to modify the result of a
conditional operator or to access it after the next sequence point,
the behavior is undefined.
....
6.5.17.2 The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its
evaluation. Then the right operand is evaluated; the result has its
type and value. If an attempt is made to modify the result of a comma
operator or to access it after the next sequence point, the behavior
is undefined.
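No, the spec does not specify the order of evaluation for the operands of relational operators. It's unspecified.
Just to add, relational operators are left-to-right associative, which affects only how a chain such as a < b < c is grouped, not the order in which the operands are evaluated.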
I have executed the following code in Code::Blocks 10.05 on Windows 7.
int a=0,b=0,c;
c=a++&&b++;
printf("\na=%d\nb=%d\nc=%d\n\n",a,b,c);
The output I obtained is given below,
a=1
b=0
c=0
This makes perfect sense because of short circuit evaluation.
The expression a++ is a post-increment, so 0 is passed to the logical AND (&&). Hence b++ is not evaluated, since both 0 && 0 and 0 && 1 evaluate to 0.
But here arises my doubt. The operator precedence table clearly states that ++ has higher precedence than &&. So my understanding was that both a++ and b++ are evaluated, and then && only checks the result of a++ to come to a decision. But that is not what happened: only a++ was evaluated.
What is the reason for this behavior? Does && being a sequence point have something to do with it? If so, why do we say that && has lower precedence than ++?
You are confusing precedence with order of evaluation.
Precedence defines how the operators are grouped, i.e.
c = a++ && b++;
is equivalent to:
c = ((a++) && (b++));
Order of evaluation defines how the expression is evaluated. The short-circuiting of && means a++ is evaluated first; if it's zero, evaluation stops there; if it's not zero, b++ is then evaluated.
As another example:
c = (a++) + (b++);
Is a++ evaluated before b++? The answer is we don't know. Most operators don't define the order of evaluation. && is one of the few operators that do define it. (The others are ||, the comma operator, and ?:.)
There are two concepts here - order of precedence and order of evaluation. Order of precedence will have an impact only if an expression (or sub-expression) is evaluated.
In general, the order of evaluation is not sequenced. Given an operator, its operands can be evaluated in any order. The arguments of a function can be evaluated in any order.
From the C++ Standard:
1.9 Program execution
15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
and
8.3.6 Default arguments
9 Default arguments are evaluated each time the function is called. The order of evaluation of function arguments is unspecified.
For the logical AND operator, &&, the C++11 standard says:
5.14 Logical AND operator
1 The && operator groups left-to-right. The operands are both contextually converted to type bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
A similar exception is specified for the logical OR operator, ||.
Since b++ is not evaluated due to short circuiting of the expression because of && operator, the order of precedence of the operators has no significance in this particular case.
Will the right side always be evaluated ahead of the left side, with the result then passed on to the left side? I am not talking about exceptions such as A[i]=i++.
I am talking about the normal cases:
A[i] = (j+32+43 & K);
A[j] != (A[j] + A[k]);
Will the right part of all these expressions be evaluated first, and the result then compared with the left side? (Always)
In general, the order of evaluation of sub-expressions is unspecified. There are a few exceptions, such as logical AND, logical OR, and the comma operator.
Since your comment stated that you are interested in the general rule:
any operator @YuHao if there is any general rule
that would be covered by the draft C99 standard section 6.5 Expressions paragraph 3 which says (emphasis mine going forward):
The grouping of operators and operands is indicated by the syntax.74)
Except as specified later (for the function-call (), &&, ||, ?:, and
comma operators), the order of evaluation of subexpressions and the
order in which side effects take place are both unspecified.
This is basically the same in the draft C11 standard, except that C11 does not list the exceptions, so quoting C99 is more convenient. Paragraph 3 in C11 says:
The grouping of operators and operands is indicated by the syntax.85)
Except as specified later, side effects and value computations of
subexpressions are unsequenced.86)
Specifically for assignment operators C99 says:
The order of evaluation of the operands is unspecified [...]
and C11 says:
[...] The evaluations of the operands are unsequenced.
No, there is no such guarantee, N1570 §6.5.16/p3 (emphasis mine):
An assignment operator stores a value in the object designated by the
left operand. An assignment expression has the value of the left
operand after the assignment,111) but is not an lvalue. The type of an
assignment expression is the type the left operand would have after
lvalue conversion. The side effect of updating the stored value of the
left operand is sequenced after the value computations of the left and
right operands. The evaluations of the operands are unsequenced.
Note that the assignment operator "consumes" two operands and has the side effect of modifying an lvalue.
I just encountered this today, and spent an hour debugging code like this:
int a[1], b=0;
a[b++] = b;
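I expected a[0] to contain 0 after this, but the compiler actually decided to evaluate b++ first, then the right side of the assignment, and store the result in a[0] (so b++ on the left side worked as it should). Since b is modified by b++ and read on the right side with no sequencing between the two, the behavior is undefined, and neither result is guaranteed. With this compiler it effectively became: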
b++;
a[0] = b; // 1
Precedence and associativity determine how the operators and operands are grouped; they do not determine which side of an expression is evaluated first.
A full list can be found here
I know that the order of computations in C is not strict, so the value of the expression --a + ++a is undefined because it's unknown which part of the statement runs first.
But what if I know that the order of computations is irrelevant in a particular case? For example:
All modifications apply to different variables (like in a[p1++] = b[p2++])
The order does not matter, like in a++ + ++a: the result is two no matter which side of + is calculated first. Is it guaranteed that one of the parts will be calculated fully before the other starts? I.e., is the compiler forbidden from computing the result of a++ and the result of ++a from the same initial value of a, for example by caching the initial value and passing it to both operators independently, so that the sum is one instead of two?
I'm interested in answers about C, C99, C11, C++03 and C++11, if there is any difference between all of them.
The standard says:
Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be accessed only to
determine the value to be stored. /26/
Except as indicated by the syntax /27/ or otherwise specified later
(for the function-call operator () , && , || , ?: , and comma
operators), the order of evaluation of subexpressions and the order in
which side effects take place are both unspecified.
So:
1.) a[p1++] = b[p2++]: It is guaranteed that the statement is evaluated correctly and gives the expected result. This is because each variable is modified only once and the result does not depend on the time when the actual increment of both variables is done.
2.) a++ + ++a: It is not guaranteed that the side effect (increment) is performed before the second usage of a. Because a is modified twice between sequence points, the behavior is undefined; in practice the expression can give the value a + (a+1) or (a+1) + (a+1) or a + (a+2), depending on when your compiler performs the side effect increments of the original variable.
Online C 2011 standard:
6.5 Expressions
...
3 The grouping of operators and operands is indicated by the syntax.85) Except as specified
later, side effects and value computations of subexpressions are unsequenced.86)
85) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same
as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the
expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in
6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators
(6.5.3), and an operand contained between any of the following pairs of operators: grouping
parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and
the conditional operator ? : (6.5.15).
Within each major subclause, the operators have the same precedence. Left- or right-associativity is
indicated in each subclause by the syntax for the expressions discussed therein.
86) In an expression that is evaluated more than once during the execution of a program, unsequenced and
indeterminately sequenced evaluations of its subexpressions need not be performed consistently in
different evaluations.
Emphasis added.
There's no guarantee that the side effect of either a++ or ++a is applied before the other expression is evaluated, so you can get different results depending on the sequence of operations.
Here are several cases, assuming a starts out at 1:
Left to right evaluation, side effects applied immediately: (1) + (2+1) == 4
Left to right evaluation, side effects deferred: (1) + (1+1) == 3
Right to left evaluation, side effects applied immediately: (2) + (1+1) == 4
Right to left evaluation, side effects deferred: (1) + (1+1) == 3
Or any other combination.