This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 5 years ago.
I am curious about the exact definition of the C standard about variable assignment within a conditional clause. Here is a small example:
int foo() {
...
}
int bar() {
...
}
int main () {
int a, b, x, y;
...
if ( (a = foo()) == x && (b = bar() == y ) {
...
}
...
}
A test with GCC revealed that if (a = foo()) != x, b = bar() will not be executed. On the one hand this behavior is optimal, as it will not waste any time with the calculation of bar(). On the other hand, though, the value of b is somewhat undefined, because it depends on the result of foo(), which actually has nothing to do with b.
I wonder if there is an explicit definition for such cases in the C standard and what the reasons for that definition would be. And finally, what is considered to be best practice for writing such code?
A test with GCC revealed that if (a = foo()) != x, b = bar() will not be executed.
Well, that's the property of the logical AND (&&) operator.
Quoting C11, chapter ยง6.5.13, emphasis mine
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation;
if the second operand is evaluated, there is a sequence point between the evaluations of
the first and second operands. If the first operand compares equal to 0, the second
operand is not evaluated.
That said, for the assignment of b, you're right, if the first operand is FALSE, the value of b remains indeterminate and any usage of the variable may lead to undefined behavior.
(One of the) The best practice, is to initialize local variables upon definition to avoid use of indeterminate value.
The operator && consists of short circuit evaluation: the second subexpression is only evaluated if its result is needed in order to determine the result of the whole expression.
In your example, in the expression:
(a = foo()) == x && (b = bar() == y)
if the first subexpression (a = foo()) == x is evaluated to false, the result of the expression above is also false regardless of the result that the subexpression (b = bar() == y) would produce.
Because of both short-circuit evaluation and the fact that (a = foo()) == x evaluates to false, the expression (b = bar() == y) is never evaluated.
It's very simple: the second (i.e. right hand) argument of && will not be evaluated if the first argument evaluates to 0.
So, (b = bar() == y ) will not be evaluated if (a = foo()) == x is 0.
The fact that this could leave b uninitialised - with potentially undefined results - is a touchstone for the fact that you ought not code this way.
You could use
if ((a = foo() == x) & (b = bar() == y))
if you want both sides to always be evaluated: & is not short-circuited. But beware of unexpected effects if the arguments can evaluate to something other than 0 or 1 (which they don't in this case).
Related
Hi just wondering if you use a chained assigment in an if condition, would the leftmost variable be used to check the if condition
like a=b=c , its a thats ultimetly checked and not b or c
#include <stdio.h>
int main()
{
int a, b, c =0;
// does this reduce to a == 100 and the variables b or c are not checked if they are == to 100 but simply assigned the value of 100 ?
if( (a = b = c = 100) == 100)
printf( "a is 100 \n");
return 0;
}
The expression is not actually checking a or b or c.
An assignment expression, like any expression, has a value. And in this case it is the value that is stored. However, the actual storing of the value in an object is a side effect so there's no guarantee that it has happened at the time the comparison operator is evaluated.
So the condition is actually more like:
if (100 == 100)
With the assignment to a, b, and c happening in a manner that is unsequenced with respect to the comparison.
This is spelled out in section 6.5.16p3 of the C standard regarding assignment operators:
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue. The type of an assignment expression is the type the left operand would have after lvalue conversion. The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.
The condition is always true. Your code is equivalent to:
a = 100;
b = 100;
c = 100;
printf( "a is 100 \n");
I have three variables: a, b, c. Let's say they are integers. I want to find the first non-zero value among them, in that particular order, without looping. The following seems to work, but I am not sure if that is because I am lucky, or because the language guarantees it:
int main(int argc, char *argv[]) {
int a = 0;
int b = 3;
int c = 5;
int test;
if ((test = a) != 0 || (test = b) != 0 || (test = c) != 0) {
printf("First non-zero: %d\n", test);
} else {
printf("All zero!\n");
}
return 0;
}
Is the repeated assignment with short-circuiting shown here guaranteed to work as intended, or am I missing something?
This might be one place where a three-letter answer would be acceptable, but a two-letter answer might require more explanation.
It would!
Because of the nature of the OR operator if any of the condition is
true then the test stops.
Thus i think what you did was basically equivalent to:
test = a != 0 ? a : b != 0 ? b : c != 0 ? c : 0;
printf("%d\n",test);
but heck yours looks good.
[update]
As per what chqrlie mentioned it can be further simplified to:
test = a ? a : b ? b : c;
Yes, your expression is fully defined because there is a sequence point at each || operator and the short circuit evaluation guarantees that the first non zero value assigned to test completes the expression.
Here is a crazy alternative without sequence points that may produce branchless code:
int test = a + !!a * (b + !!b * c);
printf("%d\n", test);
The code is very bad practice but it is guaranteed to work fine.
This is because the || and && operators have special characteristics - unlike most operators in C, they guarantee that the evaluation of the left operand is sequenced (executed) before the evaluation of the right operand. This is the reason that the code works. There's also a guarantee that the right operand will not be evaluated if it is sufficient to evaluate the left one ("short circuit"). Summarized in C17 6.5.14/4:
Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; if the
second operand is evaluated, there is a sequence point between the evaluations of the first
and second operands. If the first operand compares unequal to 0, the second operand is
not evaluated.
"Sequence point" being the key here, which is what gives the expression a deterministic outcome.
Had you used pretty much any other operator (like for example bitwise |), then the result would be undefined, because you have multiple side effects (assignments) on the same variable test in the same expression.
A more sound version of the same algorithm would involve storing the data in an array and loop through it.
#include <stdio.h>
int main() {
int a = 1;
int b = a || (a | a) && a++;
printf("%d %d\n", a, b);
return 0;
}
when I ran this code the results were 1 and 1.
According to the C language Operator Precedence the operation && is supposed to happen before the operation ||. So shouldn't the result be 2 1 ? (a = 2, b = 1)
when OR'ing expressions in C, a shortcut is taken, I.E. as soon as an expression is evaluated to TRUE, the rest of the OR'd expressions are not evaluated
The first expression a evaluates to TRUE, so all the rest of the expressions are not evaluated, so a is never incremented
When applying the operator precedence rules, the expression is equivalent to:
int b = a || ((a | a) && a++);
The evaluation or the operands to || and && is performed from left to right and shortcut evaluation prevents evaluating the right operand if the left operand can determine the result: since a is non zero, a || anything evaluates to 1 without evaluating the right operand, hence bypassing the a++ side effect.
Therefore both a and b have value 1 and the program prints 1 1.
Conversely, if you had written int b = (a || ((a | a)) && a++;, the left operand of && would have value 1, so the right operand need to be evaluated. a++ evaluates to 1 but increments a, so b have final value 1 and a is set to 2, producing your expected result.
The confusion comes from equating operator precedence with order of evaluation. These are two separate notions: operator precedence determines the order in which to apply the operators, but does not determine the order of evaluation of their operands.
Only four operators have a specified order of evaluation of their operands: &&, ||, ? : and , and the first 2 may skip evaluation of one, depending on the value of the other operand and the third only evaluates the first and only one among the second and third operands. For other operators, the order of evaluation of the operands is unspecified, and it may differ from one compiler to another, one expression to another or even one run to another, although unlikely.
When I try to put a || (a | a) into a parenthesis. The result was as you expected.
So I guess that when C compiler executes an OR operator and it get a True value to the OR, the execution will finish immediately.
In your case, when the compiler executed the a || (a | a) (1 || something) operation. Value of b will be declared to 1 right away and a++ operator won't be execute.
In the follwing code,
int a = 1, b = 2, c = 3, d;
d = a++ && b++ || c++;
printf("%d\n", c);
The output will be 3 and I get that or evaluates first condition, sees it as 1 and then doesn't care about the other condition but in c, unary operators have a higher precedence than logical operators and like in maths
2 * 3 + 3 * 4
we would evaluate the above expression by first evaluating product and then the summation, why doesn't c do the same? First evaluate all the unary operators, and then the logical thing?
Please realize that precedence is not the same concept as order of evaluation. The special behavior of && and || says that the right-hand side is not evaluated at all if it doesn't have to be. Precedence tells you something about how it would be evaluated if it were evaluated.
Stated another way, precedence helps describe how to parse an expression. But it does not directly say how to evaluate it. Precedence tells us that the way to parse the expression you asked about is:
||
/ \
/ \
&& c++
/ \
/ \
a++ b++
But then when we go to evaluate this parse tree, the short-circuiting behavior of && and || tells us that if the left-hand side determines the outcome, we don't go down the right-hand side and evaluate anything at all. In this case, since a++ && b++ is true, the || operator knows that its result is going to be 1, so it doesn't cause the c++ part to be evaluated at all.
That's also why conditional expressions like
if(p != NULL && *p != '\0')
and
if(n == 0 || sum / n == 0)
are safe. The first one will not crash, will not attempt to access *p, in the case where p is NULL. The second one will not divide by 0 if n is 0.
It's very easy to get the wrong impression abut precedence and order of evaluation. When we have an expression like
1 + 2 * 3
we always say things like "the higher precedence of * over + means that the multiplication happens first". But what if we throw in some function calls, like this:
f() + g() * h()
Which of those three functions is going to get called first? It turns out we have no idea. Precedence doesn't tell us that. The compiler could arrange to call f() first, even though its result is needed last. See also this answer.
I don't know if anyone could kindly explain this code for me?
unsigned int x = 0;
(x ^= x ) || x++ || ++x || x++;
printf("%d\n", x);
when I compile this on my computer using gcc 4.2, the output is 2.
Originally i thought maybe this behavior is unspecified but then i figure || will have lower precedence over other operators, so shouldn't the answer be 3? Since there are three "++".
Can someone explain? Thanks
(x ^= x) is evaluated and it yields 0, therefore:
(x++) is evaluated and it yields 0, therefore:
(++x) is evaluated and it yields 2, therefore it stops
It all boils down to one rule: || only evaluates its right side if its left side is false.
The issue is that the || operator is short-circuiting. As soon as it finds a true value, it no longer needs to check the remaining || statements; the answer is already known.
(x ^= x) evaluates to 0.
x++ evaluates to 0, then increments x to 1.
++x evaluates to 2 -- true.
The final or statement does not need to be computed. It "short-circuits" and immediately returns true.
The behaviour is well-defined. You are observing the short-circuiting behaviour of ||; the final x++ is never evaluated.
That is short-circuit semantics in action. The first expression x ^= x evaluates to 0, the second evaluates to 0 as well. The third one evaluates to 2, and then the logical expression is short-circuited since its result its already determined to be true.
Of course you shouldn't use such constructs, but let us analyze the expression.
There are 4 expressions combined with shortcut-OR:
a || b || c || d
b, c and d are only evaluated, if a is false, c and d only if b is false too, and d only if all from the before are false.
Ints are evaluated as 0 == false, everything else is not false.
x ^= 0
with x being 0 is 0 again.
x++
is evaluated and later increased, so it evaluates to 0, which invokes expression c, but later x will be increased.
++x
is first incremented, leading to 1, and then evaluated (leading to 1) which is the reason, why d is not evaluated, but the increment of b is pending, so we get 2.
But I'm not sure, whether such behaviour is exactly defined and leads to the same result on all compilers.
Avoid it.
The expressions between the || operators will be evaluated left to right until one is true:
(x ^= x )
Sets all bits in x to 0/off. (false);
x++
Increments x, it's now 1, but still false because this was a post increment.
++x
(pre-)Increments x, which is now 2 and also true, so there is no need for the right hand side of || to be evaluated.