Short circuit evaluation and side effects [duplicate]

Short circuit evaluation and side effects [duplicate] - c

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 5 years ago.
OK, I'm a little embarassed to ask this question, but I just want to be sure...
It is known that C uses short circuit evaluation in boolean expressions:
int c = 0;
if (c && func(c)) { /* whatever... */ }
In that example func(c) is not called because c evaluates to 0. But how about more sophisticated example where side effects of comparison would change the variable being compared next? Like this:
int c; /* this is not even initialized... */
if (canInitWithSomeValue(&c) && c == SOMETHING) { /*...*/ }
Function canInitWithSomeValue returns true and changes value at given pointer in case of success. Is it guaranteed that subsequent comparisons (c == SOMETHING in this example) uses value set by canInitWithSomeValue(&c)?
No matter how heavy optimizations the compiler uses?

Is it guaranteed that subsequent comparisons (c == SOMETHING in this example) uses value set by canInitWithSomeValue(&c)?
Yes. Because there is a sequence point
Between evaluation of the left and right operands of the && (logical AND), || (logical OR), and comma operators. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q.
A sequence point defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.

Yes. Because both && and || operator are also something called sequence points. The latter define when the side-effects of a previous operation should be complete and those of the next should not have begun.

Evaluation within the if statement composite condition is strictly left to right. The only circumstance under which the second test in your if would be optimized out is if the compiler can determine with 100% certainty that the first is identically equal to false.

Related

Operator associativity with 'postfix decrement' and 'logical AND' operators in c

Disclaimer: I don't code like this, I'm just trying to understand how the c language works!!!!
The output is 12.
This expression (a-- == 10 && a-- == 9) evaluates left-to-right, and a is still 10 at a-- == 10 but a is 9 for a-- == 9.
1) Is there a clear rule as to when post-increment evaluate? From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
2) Also for c/c++, certain operators such as prefix decrement occur right to left so a == --a first decrements a to 9 and then compares 9 == 9. Is there a reason for why c/c++ is designed this way? I know for Java, it's the opposite (it's evaluates left to right).
#include <stdio.h>
int main() {
int a = 10;
if (a-- == 10 && a-- == 9)
printf("1");
a = 10;
if (a == --a)
printf("2");
return 0;
}

The logical && operator contains a sequence point between the evaluation of the first and second operand. Part of this is that any side effect (such as that performed by the -- operator) as part of the left side is complete before the right side is evaluated.
This is detailed in section 6.5.13p4 of the C standard regarding the logical AND operator:
Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; if the second operand is evaluated,
there is a sequence point between the evaluations of the
first and second operands. If the first operand compares
equal to 0, the second operand is not evaluated.
In the case of this expression:
(a-- == 10 && a-- == 9)
The current value of a (10) is first compared for equality against 10. This is true, so the right side is then evaluated, but not before the side effect of decrementing a that was done on the left side. Then, the current value of a (now 9) is compared for equality against 9. This is also true, so the whole expression evaluates to true. Before the next statement is executed, the side effect of decrementing a that was done on the right side is done.
This expression however:
if (a == --a)
Involves reading and writing a in the same expression without a sequence point. This invokes undefined behavior.

This expression (a-- == 10 && a-- == 9) evaluates left-to-right,
Yes, mostly, but only because && is special.
and a is still 10 at a-- == 10
Yes, because a-- yields the old value.
but a is 9 for a-- == 9.
Yes, because the sequence point at && guarantees the update to a's value is complete before the RHS is evaluated.
1) Is there a clear rule as to when post-increment evaluate?
The best answer, I think, is "no". The side effects due to ++ and -- are completed at some point prior to the next sequence point, but beyond that, you can't say. For well-defined expressions, it doesn't matter when the side effects are completed. If an expression is sensitive to when the side effect is completed, that usually means the expression is undefined.
From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
Basically yes.
2) Also for c/c++, certain operators such as prefix decrement occur right to left
Careful. I'm not sure what you mean, but whatever it is, I'm almost certain it's not true.
so a == --a first decrements a to 9 and then compares 9 == 9.
No, a == --a is undefined. There's no telling what it does.
Is there a reason for why c/c++ is designed this way?
Yes.
I know for Java, it's the opposite (it's evaluates left to right).
Yes, Java is different.
Here are some guidelines to help you understand the evaluation of C expressions:
Learn the rules of operator precedence and associativity. For "simple" expressions, those rules tell you virtually everything you need to know about an expression's evaluation. Given a + b * c, b is multiplied by c and then the product added to a, because of the higher precedence of * over +. Given a + b + c, a is added to b and then the sum added to c, because+ associates from left to right.
With the exception of associativity (as mentioned in point 1), try not to use the words "left to right" or "right to left" evaluation at all. C has nothing like left to right or right to left evaluation. (Obviously Java is different.)
Where it gets tricky is side effects. (When I said "'simple' expressions" in point 1, I basically meant "expressions without side effects".) Side effects include (a) function calls, (b) assignments with =, (c) assignments with +=, -=, etc., and of course (d) increments/decrements with ++ and --. (If it matters when you fetch from a variable, which is typically only the case for variables qualified as volatile, we could add (e) fetches from volatile variables to the list.) In general, you can not tell when side effects happen. Try not to care. As long as you don't care (as long as your program is insensitive to the order in which side effects matter), it doesn't matter. But if your program is sensitive, it's probably undefined. (See more under points 4 and 5 below.)
You must never ever have two side effects in the same expression which attempt to alter the same variable. (Examples: i = i++, a++ + a++.) If you do, the expression is undefined.
With one class of exceptions, you must never ever have a side effect which attempts to alter a variable which is also being used elsewhere in the same expression. (Example: a == --a.) If you do, the expression is undefined. The exception is when the value accessed is being used to compute the value to be stored, as in i = i + 1.

With "logical and" operator (a-- == 10 && a-- == 9) is well-formed expression (without undefined behavior as it is in a++ + a++).
C standard says about "logical and"/"logical or" operators:
guarantees left-to-right evaluation; there is a sequence point after
the evaluation of the first operand.
So that, all side effects of the first subexpression a-- == 10 are complete before the second subexpression a-- == 9 evaluation.
a is 9 before evaluation of second subexpression.

The underlying problem is that the postfix unary operator has both a return value (the starting value of the variable) and a side effect (incrementing the variable). While the value has to be calculated in order, it is explicit in the C++ specs that the sequencing of any side effect relative to the rest of the operators in a statement is undefined, as long as it happens before the full expression completes. This allows compilers (and optimizers) to do what they want, including evaluating them differently on different expressions in the same program.
From the C++20 code spec (C++ 2020 draft N4849 is where I got this from):
Every value computation and side effect associated with a full-expression is
sequenced before every value computation and side effect associated with the
next full-expression to be evaluated.
[6.9.1 9, p.72]
If a side effect on a memory location is unsequenced
relative to either another side effect on the same memory location or
a value computation using the value of any object in the same memory location, and they are not potentially concurrent, the behavior is undefined. [6.9.1 10, p.72]
So, in case you haven't gotten it from other answers:
No, there is no defined order for a postfix operator. However, in your case (a-- == 10 && a-- == 9) has defined behavior because the && enforces that the left side must be evaluated before the right side. It will always return true, and at the end, a==8. Other operators or functions such as (a-- > a--) could get a lot of weird behavior, including a==9 at the end because both prefix operators store the original value of a(10) and decrement that value to 9 and store it back to a.
Not only is the side-effect of setting a=a-1 (in the prefix operator) unsequenced with the rest of this expression, the evaluation of the operands of == is also unsequenced. This expression could:
Evaluate a(10), then evaluate --a(9), then == (false), then set a=9.
Evaluate --a(9), then a(10), then set a=9, then evaluate == (false).
Evaluate --a(9) the set a=9, then evaluate a(9), then evaluate == (true)
Yes, it is very confusing. As a general rule (which I think you already know): Do not set a variable more than once in the same statement, or use it and set it in the same statement. You have no idea what the compiler is going to do with it, especially if this is code that will be published open source, so someone might compile it differently than you did.
Side note:
I've seen so many responses to questions about the undefined behavior of postfix operators that complain that this "undefined behavior" only ever occurs in the toy examples presented in the questions. This really annoys me, because it can and does happen. Here is a real example of how the behavior can change that I had actually happen to me in my legacy code base.
result[ctr]=source[ctr++];
result[ctr++]=(another calculated value without ctr in it);
In Visual Studio, under C++14, this evaluated so that result had every other value of source in even indexes and had the calculated values in the odd indexes. For example, for ctr=0, it would store source[0], copy the stored value to result[0], then increment ctr, then set result[1] to the calculated value, then increment ctr. (Yes, there was a reason for wanting that result.)
We updated to C++20, and this line started breaking. We ended up with a bad array, because it would store source[0], then increment ctr, then copy the stored value to result[1], then set result[1] to the calculated value, then increment ctr. It was only setting the odd indexes in result, first from source then overwriting the source value with the calculated value. All the odd indexes of result stayed zero (our original initialization value).
Ugh.

Why do logical operators in C not evaluate the entire expression when it's not necessary to?

I was reading my textbook for my computer architecture class and I came across this statement.
A second important distinction between the logical operators '&&' and '||' versus their bit-level counterparts '&' and '|' is that the logical operators do not evaluate their second argument if the result of the expression can be determined by evaluating the first argument. Thus, for example, the expression a && 5/a will never cause a division by zero, and the expression p && *p++ will never cause the dereferencing of a null pointer. (Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron, 3rd Edition, p. 57)
My question is why do logical operators in C behave like that? Using the author's example of a && 5/a, wouldn't C need to evaluate the whole expression because && requires both predicates to be true? Without loss of generality, my same question applies to his second example.

Short-circuiting is a performance enhancement that happens to be useful for other purposes.
You say "wouldn't C need to evaluate the whole expression because && requires both predicates to be true?" But think about it. If the left hand side of the && is false, does it matter what the right hand side evaluates to? false && true or false && false, the result is the same: false.
So when the left hand side of an && is determined to be false, or the left hand side of a || is determined to be true, the value on the right doesn't matter, and can be skipped. This makes the code faster by removing the need to evaluate a potentially expensive second test. Imagine if the right-hand side called a function that scanned a whole file for a given string? Wouldn't you want that test skipped if the first test meant you already knew the combined answer?
C decided to go beyond guaranteeing short-circuiting to guaranteeing order of evaluation because it means safety tests like the one you provide are possible. As long as the tests are idempotent, or the side-effects are intended to occur only when not short-circuited, this feature is desirable.

A typical example is a null-pointer check:
if(ptr != NULL && ptr->value) {
....
}
Without short-circuit-evaluation, this would cause an error when the null-pointer is dereferenced.
The program first checks the left part ptr != NULL. If this evaluates to false, it does not have to evaluate the second part, because it is already clear that the result will be false.

In the expression X && Y, if X is evaluated to false, then we know that X && Y will always be false, whatever is the value of Y. Therefore, there is no need to evaluate Y.
This trick is used in your example to avoid a division by 0. If a is evaluated to false (i.e. a == 0), then we do not evaluate 5/a.
It can also save a lot of time. For instance, when evaluating f() && g(), if the call to g() is expensive and if f() returns false, not evaluating g() is a nice feature.

wouldn't C need to evaluate the whole expression because && requires both predicates to be true?
Answer: No. Why work more when the answer is known "already"?
As per the definition of the logical AND operator, quoting C11, chapter §6.5.14
The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it
yields 0.
Following that analogy, for an expression of the form a && b, in case a evaluates to FALSE, irrespective of the evaluated result of b, the result will be FALSE. No need to waste machine cycle checking the b and then returning FALSE, anyway.
Same goes for logical OR operator, too, in case the first argument evaluates to TRUE, the return value condition is already found, and no need to evaluate the second argument.

It's just the rule and is extremely useful. Perhaps that's why it's the rule. It means we can write clearer code. An alternative - using if statements would produce much more verbose code since you can't use if statements directly within expressions.
You already give one example. Another is something like if (a && b / a) to prevent integer division by zero, the behaviour of which is undefined in C. That's what the author is guarding themselves from in writing a && 5/a.
Very occasionally if you do always need both arguments evaluated (perhaps they call functions with side effects), you can always use & and |.

how do we interpret the `||` and `&&` in an assignment statement? [duplicate]

This question already has answers here:
Evaluation of C expression
(8 answers)
Closed 7 years ago.
I have been coding from a long time though I'm still a student programmer/ I'm usually good at programming but when questions like the one below are asked I get stuck. What will be the output and why of the following program?
int main()
{
int i=4,j=-1,k=0,w,x,y,z;
w=i||j||k;
print("%d",w);
return 0;
}
output:
1
why this result? what does the statement w=||j||k; means?

i || j || k is evaluated from left to right. It does that:
i == 4, which is true, so ORing it with any other value will yield true. That's it1.
The rest of the statement is not evaluated because || and && are short-circuit operators, that is, if in your statement i != 0, neither j nor k will be evaluated because the result is guaranteed to be 1. && works similarly.
That's important to remember if you have something like f() || k(), where k has some side effect like an output to screen or a variable assignment; it might not be executed at all.
The bitwise OR operator | really ORs the bitwise representations of the values instead; it evaluates all its operands.
1 Thanks to #SouravGosh on that!

In your code,
w=i||j||k;
is equivalent to
w= ((i||j) || k);
That means, first the (i||j) will be evaluated, and based on the result (if 0), the later part will be evaluated.
So, in your case, i being 4, (i||j) evaluates to 1 and based on the logical OR operator semantics, the later part is not evaluated and the whole expression yields 1 which is finally assigned to w.
Related quotes, from C11 standard, chapter §6.5.14, Logical OR operator
The || operator shall yield 1 if either of its operands compare unequal to 0; otherwise, it
yields 0. The result has type int.
then, regarding the evaluation of arguments,
[...] If the first operand compares unequal to 0, the second operand is
not evaluated.
and regarding the grouping,
[...] the || operator guarantees left-to-right evaluation;

The result of boolean operators yields an int value of 0 or 1.
See 6.5.13 and 6.5.14, paragraph 3.
The [...] operator shall yield 1 [or] 0. The result has type int.

Conditional execution based on short-circuit logical operation

As the evaluation of logical operators && and || are defined as "short circuit", I am assuming the following two pieces of code are equivalent:
p = c || do_something();
and
if (c) {
p = true;
}
else {
p = do_something();
}
given p and c are bool, and do_something() is a function returning bool and possibly having side effects. According to the C standard, can one rely on the assumption the snippets are equivalent? In particular, having the first snippet, is it promised that if c is true, the function won't be executed, and no side effects of it will take place?

After some search I will answer my question myself referencing the standard:
The C99 standard, section 6.5.14 Logical OR operator is stating:
Unlike the bitwise | operator, the || operator guarantees
left-to-right evaluation; there is a sequence point after the
evaluation of the first operand. If the first operand compares unequal
to 0, the second operand is not evaluated.
And a similar section about &&.
So the answer is yes, the code can be safely considered equivalent.

Yes, you are correct in your thinking. c || do_something() will short-circuit if c is true, and so will never call do_something().
However, if c is false, then do_something() will be called and its result will be the new value of p.

Safety concerns about short circuit evaluation [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Is short-circuiting boolean operators mandated in C/C++? And evaluation order?
AFAIK Short circuit evaluation means that a boolean expression is evaluated only up to the point that we can guarantee its outcome.
This is a common idiom in perl where we can write things like:
(is_ok() returns non-zero value on "OK")
is_ok() || die "It's not OK!!\n";
instead of
if ( ! is_ok() ) {
die "It's not OK!!\n";
}
This only works because the order of evaluation is always left-to right and that guarantees that the rightmost statement is only executed if the first statement if not "false".
In C I can do something simillar like:
struct foo {
int some_flag;
} *ptr = 0;
/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
/* do something */
}
Is it safe to use this kind of idiom?
Or is there any chance that the compiler may generate code that evaluates ptr->some_flag before making sure that ptr is not a zero pointer? (I am assuming that if it is non-null it points to some valid memory region).
This syntax is convenient to use because it saves typing without losing readability (in my opinion anyway). However I'm not sure if it is entirely safe which is why I'd like to learn more on this.
NB: If the compiler has an effect on this, I'm using gcc 4.x

The evaluation order of short-circuit operators (|| and &&) is guaranteed by the standard to be left to right (otherwise they would lose part of their usefulness).
§6.5.13 ¶4
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation;
there is a sequence point after the evaluation of the first operand. If the first operand
compares equal to 0, the second operand is not evaluated.
§6.5.14 ¶4
Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is
a sequence point after the evaluation of the first operand. If the first operand compares
unequal to 0, the second operand is not evaluated.

Or is there any chance that the compiler may generate code that
evaluates ptr->some_flag before making sure that ptr is not a zero
pointer?
No, zero-chance. It is guaranteed by the standard that ptr->some_flag will not be evaluated if the first operand is false.
6.5.13-4
Unlike the bitwise binary & operator,the && operator guarantees
left-to-right evaluation; there is a sequence point after the
evaluation of the ﬁrst operand. If the ﬁrst operand compares equal to
0, the second operand is not evaluated.

/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
/* do something */
}
Is it safe to use this kind of idiom?
Yes.

It is safe in the above example, however anyone maintaining such code may not realise that there is a dependency on the order of evaluation and cause an interesting bug.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Short circuit evaluation and side effects [duplicate] - c

Yes. Because both && and || operator are also something called sequence points. The latter define when the side-effects of a previous operation should be complete and those of the next should not have begun.

Evaluation within the if statement composite condition is strictly left to right. The only circumstance under which the second test in your if would be optimized out is if the compiler can determine with 100% certainty that the first is identically equal to false.

Related

Operator associativity with 'postfix decrement' and 'logical AND' operators in c

Why do logical operators in C not evaluate the entire expression when it's not necessary to?

how do we interpret the `||` and `&&` in an assignment statement? [duplicate]

Conditional execution based on short-circuit logical operation

Safety concerns about short circuit evaluation [duplicate]

Categories

Resources