Safety concerns about short circuit evaluation [duplicate] - c

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Is short-circuiting boolean operators mandated in C/C++? And evaluation order?
AFAIK Short circuit evaluation means that a boolean expression is evaluated only up to the point that we can guarantee its outcome.
This is a common idiom in Perl where we can write things like:
(is_ok() returns non-zero value on "OK")
is_ok() || die "It's not OK!!\n";
instead of
if ( ! is_ok() ) {
die "It's not OK!!\n";
}
This only works because the order of evaluation is always left to right, which guarantees that the right-hand statement is executed only if the first one is false.
In C I can do something similar:
struct foo {
    int some_flag;
} *ptr = 0;
/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
    /* do something */
}
Is it safe to use this kind of idiom?
Or is there any chance that the compiler may generate code that evaluates ptr->some_flag before making sure that ptr is not a zero pointer? (I am assuming that if it is non-null it points to some valid memory region).
This syntax is convenient to use because it saves typing without losing readability (in my opinion anyway). However I'm not sure if it is entirely safe which is why I'd like to learn more on this.
NB: If the compiler has an effect on this, I'm using gcc 4.x

The evaluation order of short-circuit operators (|| and &&) is guaranteed by the standard to be left to right (otherwise they would lose part of their usefulness).
§6.5.13 ¶4
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation;
there is a sequence point after the evaluation of the first operand. If the first operand
compares equal to 0, the second operand is not evaluated.
§6.5.14 ¶4
Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is
a sequence point after the evaluation of the first operand. If the first operand compares
unequal to 0, the second operand is not evaluated.

Or is there any chance that the compiler may generate code that
evaluates ptr->some_flag before making sure that ptr is not a zero
pointer?
No, zero chance. The standard guarantees that ptr->some_flag will not be evaluated if the first operand is false.
6.5.13-4
Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; there is a sequence point after the
evaluation of the first operand. If the first operand compares equal to
0, the second operand is not evaluated.

/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
/* do something */
}
Is it safe to use this kind of idiom?
Yes.

It is safe in the above example, however anyone maintaining such code may not realise that there is a dependency on the order of evaluation and cause an interesting bug.

Related

Operator associativity with 'postfix decrement' and 'logical AND' operators in c

Disclaimer: I don't code like this, I'm just trying to understand how the C language works!
The output is 12.
This expression (a-- == 10 && a-- == 9) evaluates left-to-right, and a is still 10 at a-- == 10 but a is 9 for a-- == 9.
1) Is there a clear rule as to when post-increment evaluate? From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
2) Also for c/c++, certain operators such as prefix decrement occur right to left so a == --a first decrements a to 9 and then compares 9 == 9. Is there a reason for why c/c++ is designed this way? I know for Java, it's the opposite (it evaluates left to right).
#include <stdio.h>
int main() {
    int a = 10;
    if (a-- == 10 && a-- == 9)
        printf("1");
    a = 10;
    if (a == --a)
        printf("2");
    return 0;
}
The logical && operator contains a sequence point between the evaluation of the first and second operand. Part of this is that any side effect (such as that performed by the -- operator) as part of the left side is complete before the right side is evaluated.
This is detailed in section 6.5.13p4 of the C standard regarding the logical AND operator:
Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; if the second operand is evaluated,
there is a sequence point between the evaluations of the
first and second operands. If the first operand compares
equal to 0, the second operand is not evaluated.
In the case of this expression:
(a-- == 10 && a-- == 9)
The current value of a (10) is first compared for equality against 10. This is true, so the right side is then evaluated, but only after the decrement of a from the left side has taken effect. Then the current value of a (now 9) is compared for equality against 9. This is also true, so the whole expression evaluates to true. The decrement from the right side takes effect before the next statement is executed, leaving a equal to 8.
This expression however:
if (a == --a)
Involves reading and writing a in the same expression without a sequence point. This invokes undefined behavior.
This expression (a-- == 10 && a-- == 9) evaluates left-to-right,
Yes, mostly, but only because && is special.
and a is still 10 at a-- == 10
Yes, because a-- yields the old value.
but a is 9 for a-- == 9.
Yes, because the sequence point at && guarantees the update to a's value is complete before the RHS is evaluated.
1) Is there a clear rule as to when post-increment evaluate?
The best answer, I think, is "no". The side effects due to ++ and -- are completed at some point prior to the next sequence point, but beyond that, you can't say. For well-defined expressions, it doesn't matter when the side effects are completed. If an expression is sensitive to when the side effect is completed, that usually means the expression is undefined.
From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
Basically yes.
2) Also for c/c++, certain operators such as prefix decrement occur right to left
Careful. I'm not sure what you mean, but whatever it is, I'm almost certain it's not true.
so a == --a first decrements a to 9 and then compares 9 == 9.
No, a == --a is undefined. There's no telling what it does.
Is there a reason for why c/c++ is designed this way?
Yes.
I know for Java, it's the opposite (it evaluates left to right).
Yes, Java is different.
Here are some guidelines to help you understand the evaluation of C expressions:
1. Learn the rules of operator precedence and associativity. For "simple" expressions, those rules tell you virtually everything you need to know about an expression's evaluation. Given a + b * c, b is multiplied by c and then the product added to a, because of the higher precedence of * over +. Given a + b + c, a is added to b and then the sum added to c, because + associates from left to right.
2. With the exception of associativity (as mentioned in point 1), try not to use the words "left to right" or "right to left" evaluation at all. Apart from a few special operators such as && and ||, C has nothing like left to right or right to left evaluation. (Obviously Java is different.)
3. Where it gets tricky is side effects. (When I said "'simple' expressions" in point 1, I basically meant "expressions without side effects".) Side effects include (a) function calls, (b) assignments with =, (c) assignments with +=, -=, etc., and of course (d) increments/decrements with ++ and --. (If it matters when you fetch from a variable, which is typically only the case for variables qualified as volatile, we could add (e) fetches from volatile variables to the list.) In general, you cannot tell when side effects happen. Try not to care. As long as you don't care (as long as your program is insensitive to the order in which side effects happen), it doesn't matter. But if your program is sensitive, it's probably undefined. (See more under points 4 and 5 below.)
4. You must never ever have two side effects in the same expression which attempt to alter the same variable. (Examples: i = i++, a++ + a++.) If you do, the expression is undefined.
5. With one class of exceptions, you must never ever have a side effect which attempts to alter a variable which is also being used elsewhere in the same expression. (Example: a == --a.) If you do, the expression is undefined. The exception is when the value accessed is being used to compute the value to be stored, as in i = i + 1.
With the "logical and" operator, (a-- == 10 && a-- == 9) is a well-defined expression (without the undefined behavior found in a++ + a++).
C standard says about "logical and"/"logical or" operators:
guarantees left-to-right evaluation; there is a sequence point after
the evaluation of the first operand.
So all side effects of the first subexpression a-- == 10 are complete before the second subexpression a-- == 9 is evaluated.
a is 9 before the second subexpression is evaluated.
The underlying problem is that the postfix unary operator has both a return value (the starting value of the variable) and a side effect (incrementing the variable). While the value has to be calculated in order, it is explicit in the C++ specs that the sequencing of any side effect relative to the rest of the operators in a statement is undefined, as long as it happens before the full expression completes. This allows compilers (and optimizers) to do what they want, including evaluating them differently on different expressions in the same program.
From the C++20 code spec (C++ 2020 draft N4849 is where I got this from):
Every value computation and side effect associated with a full-expression is
sequenced before every value computation and side effect associated with the
next full-expression to be evaluated.
[6.9.1 9, p.72]
If a side effect on a memory location is unsequenced
relative to either another side effect on the same memory location or
a value computation using the value of any object in the same memory location, and they are not potentially concurrent, the behavior is undefined. [6.9.1 10, p.72]
So, in case you haven't gotten it from other answers:
No, there is no defined order for a postfix operator's side effect. However, in your case (a-- == 10 && a-- == 9) has defined behavior because the && enforces that the left side must be evaluated before the right side. It will always evaluate to true, and at the end, a==8. With other operators or functions, an expression such as (a-- > a--) could show a lot of weird behavior, including a==9 at the end, because both postfix operators may read the original value of a (10), decrement it to 9, and store that back to a.
As for a == --a: not only is the side effect of setting a=a-1 (in the prefix operator) unsequenced with the rest of the expression, the evaluation of the operands of == is also unsequenced. This expression could:
Evaluate a(10), then evaluate --a(9), then == (false), then set a=9.
Evaluate --a(9), then a(10), then set a=9, then evaluate == (false).
Evaluate --a(9), then set a=9, then evaluate a(9), then evaluate == (true).
Yes, it is very confusing. As a general rule (which I think you already know): Do not set a variable more than once in the same statement, or use it and set it in the same statement. You have no idea what the compiler is going to do with it, especially if this is code that will be published open source, so someone might compile it differently than you did.
Side note:
I've seen so many responses to questions about the undefined behavior of postfix operators that complain that this "undefined behavior" only ever occurs in the toy examples presented in the questions. This really annoys me, because it can and does happen. Here is a real example of how the behavior can change that I had actually happen to me in my legacy code base.
result[ctr]=source[ctr++];
result[ctr++]=(another calculated value without ctr in it);
In Visual Studio, under C++14, this evaluated so that result had every other value of source in even indexes and had the calculated values in the odd indexes. For example, for ctr=0, it would store source[0], copy the stored value to result[0], then increment ctr, then set result[1] to the calculated value, then increment ctr. (Yes, there was a reason for wanting that result.)
We updated to C++20, and this line started breaking. We ended up with a bad array, because it would store source[0], then increment ctr, then copy the stored value to result[1], then set result[1] to the calculated value, then increment ctr. It was only setting the odd indexes in result, first from source then overwriting the source value with the calculated value. All the odd indexes of result stayed zero (our original initialization value).
Ugh.

how do we interpret the `||` and `&&` in an assignment statement? [duplicate]

This question already has answers here:
Evaluation of C expression
(8 answers)
Closed 7 years ago.
I have been coding for a long time, though I'm still a student programmer. I'm usually good at programming, but I get stuck when questions like the one below are asked. What will be the output of the following program, and why?
#include <stdio.h>

int main()
{
    int i=4,j=-1,k=0,w,x,y,z;
    w=i||j||k;
    printf("%d",w);
    return 0;
}
output:
1
Why this result? What does the statement w=i||j||k; mean?
i || j || k is evaluated from left to right. It works like this:
i == 4, which is nonzero and therefore true, so ORing it with any other value will yield true. That's it.[1]
The rest of the statement is not evaluated because || and && are short-circuit operators, that is, if in your statement i != 0, neither j nor k will be evaluated because the result is guaranteed to be 1. && works similarly.
That's important to remember if you have something like f() || k(), where k has some side effect like an output to screen or a variable assignment; it might not be executed at all.
The bitwise OR operator | really ORs the bitwise representations of the values instead; it evaluates all its operands.
[1] Thanks to #SouravGosh for that!
In your code,
w=i||j||k;
is equivalent to
w= ((i||j) || k);
That means (i||j) is evaluated first, and only if the result is 0 is the latter part evaluated.
So, in your case, i being 4, (i||j) evaluates to 1 and, per the logical OR operator semantics, the latter part is not evaluated; the whole expression yields 1, which is finally assigned to w.
Related quotes, from C11 standard, chapter §6.5.14, Logical OR operator
The || operator shall yield 1 if either of its operands compare unequal to 0; otherwise, it
yields 0. The result has type int.
then, regarding the evaluation of arguments,
[...] If the first operand compares unequal to 0, the second operand is
not evaluated.
and regarding the grouping,
[...] the || operator guarantees left-to-right evaluation;
The result of boolean operators yields an int value of 0 or 1.
See 6.5.13 and 6.5.14, paragraph 3.
The [...] operator shall yield 1 [or] 0. The result has type int.

Conditional execution based on short-circuit logical operation

As the evaluation of the logical operators && and || is defined as "short circuit", I am assuming the following two pieces of code are equivalent:
p = c || do_something();
and
if (c) {
    p = true;
}
else {
    p = do_something();
}
given p and c are bool, and do_something() is a function returning bool and possibly having side effects. According to the C standard, can one rely on the assumption the snippets are equivalent? In particular, having the first snippet, is it promised that if c is true, the function won't be executed, and no side effects of it will take place?
After some searching I will answer my own question, referencing the standard:
The C99 standard, section 6.5.14 (Logical OR operator), states:
Unlike the bitwise | operator, the || operator guarantees
left-to-right evaluation; there is a sequence point after the
evaluation of the first operand. If the first operand compares unequal
to 0, the second operand is not evaluated.
And a similar section about &&.
So the answer is yes, the code can be safely considered equivalent.
Yes, you are correct in your thinking. c || do_something() will short-circuit if c is true, and so will never call do_something().
However, if c is false, then do_something() will be called and its result will be the new value of p.

Order of logical OR execution in C

I was wondering whether the following statement could lead to a protection fault and other horrible stuff if the value of next is NULL (node being part of a linked list).
if (!node->next || node->next->some_field != some_value) {
I'm assuming the second part of the OR is not evaluated once the first part is true. Am I wrong in assuming this? Is this compiler-specific?
In the ISO-IEC-9899-1999 Standard (C99), Section 6.5.14:
3 The || operator shall yield 1 if either of its operands compare unequal
to 0; otherwise, it yields 0. The result has type int.
4 Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation;
there is a sequence point after the evaluation of the first operand.
If the first operand compares unequal to 0, the second operand is not
evaluated.
This is not compiler-specific. If node->next is NULL, then the rest of the condition is never evaluated.
In an OR,
if ( expr_1 || expr_2)
expr_2 only gets 'tested' when expr_1 fails (is false)
In an AND
if( expr_1 && expr_2 )
expr_2 only gets 'tested' when expr_1 succeeds (is true)
It is safe to assume that the right side boolean expression will not be evaluated if the left side evaluates to true. See relevant question.
It's not compiler specific. You can safely rely on short-circuiting and your code will work as expected.
You are correct.
It is compiler independent: the first condition, before the OR operator (!node->next), is always evaluated before the second condition, after the OR operator (node->next->some_field != some_value). If the first condition is true, the entire expression evaluates to true without evaluating the second condition.
You are simply making good use of this feature for your linked list: the next pointer is dereferenced only if it is not NULL.

Short circuit evaluation and side effects [duplicate]

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 5 years ago.
OK, I'm a little embarrassed to ask this question, but I just want to be sure...
It is known that C uses short circuit evaluation in boolean expressions:
int c = 0;
if (c && func(c)) { /* whatever... */ }
In that example func(c) is not called because c evaluates to 0. But how about more sophisticated example where side effects of comparison would change the variable being compared next? Like this:
int c; /* this is not even initialized... */
if (canInitWithSomeValue(&c) && c == SOMETHING) { /*...*/ }
Function canInitWithSomeValue returns true and changes the value at the given pointer in case of success. Is it guaranteed that subsequent comparisons (c == SOMETHING in this example) use the value set by canInitWithSomeValue(&c)?
No matter how heavily the compiler optimizes?
Is it guaranteed that subsequent comparisons (c == SOMETHING in this example) use the value set by canInitWithSomeValue(&c)?
Yes. Because there is a sequence point
Between evaluation of the left and right operands of the && (logical AND), || (logical OR), and comma operators. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q.
A sequence point defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.
Yes, because the && and || operators each introduce a sequence point. A sequence point defines when the side effects of the previous evaluation must be complete and those of the next must not yet have begun.
Evaluation within the if statement composite condition is strictly left to right. The only circumstance under which the second test in your if would be optimized out is if the compiler can determine with 100% certainty that the first is identically equal to false.

Resources