I was wondering whether the following statement could lead to a protection fault or other horrible stuff if the value of next is NULL (node being an element of a linked list).
if (!node->next || node->next->some_field != some_value) {
I'm assuming the second part of the OR is not evaluated once the first part is true. Am I wrong in assuming this? Is this compiler specific?
In the ISO-IEC-9899-1999 Standard (C99), Section 6.5.14:
The || operator shall yield 1 if either of its operands compare unequal
to 0; otherwise, it yields 0. The result has type int. Unlike the
bitwise | operator, the || operator guarantees left-to-right evaluation;
there is a sequence point after the evaluation of the first operand.
If the first operand compares unequal to 0, the second operand is not
evaluated.
This is not compiler-specific. If node->next is NULL, then the rest of the condition is never evaluated.
In an OR,
if ( expr_1 || expr_2)
expr_2 only gets 'tested' when expr_1 fails (is false)
In an AND
if( expr_1 && expr_2 )
expr_2 only gets 'tested' when expr_1 succeeds (is true)
It is safe to assume that the right side boolean expression will not be evaluated if the left side evaluates to true. See relevant question.
It's not compiler specific. You can safely rely on short-circuiting and your code will work as expected.
You are correct.
It is compiler independent: the first condition before the OR operator (!node->next) is always evaluated before the second condition after it (node->next->some_field != some_value). If the first condition is true, the entire expression evaluates to true without evaluating the second condition.
You are just making the best use of this feature for your linked list. You are going further to access next pointer only if it is not NULL.
In an if statement with multiple conditionals, is the second conditional executed if the outcome of the first is clear?
example:
if(i>0 && array[i]==0){
}
If I swap the conditionals a segfault may occur for negative values of i, but this way no segfault occurs. Can I be sure that this always works, or do I have to use nested if statements?
This type of evaluation is called short-circuiting.
Once the result is 100% clear, it does not continue evaluating.
This is actually a common programming technique.
For example, in C++ you will often see something like:
if (pX!=null && pX->predicate()) { bla bla bla }
If you changed the order of the conditions, you could be invoking a method on a null pointer and crashing. A similar example in C would use the field of a struct when you have a pointer to that struct.
You could do something similar with or:
if(pX==null || pX->isEmpty()) { bla bla bla }
This is also one of the reasons that it is generally a good idea to avoid side effects in an if condition.
For example suppose you have:
if(x==4 && (++y>7) && z==9)
If x is 4, then y will be incremented regardless of the value of z or y, but if x is not 4, it will not be incremented at all.
The operators && and || guarantee that the left-hand side expression will be fully evaluated (and all side effects applied) before the right-hand side is evaluated. In other words, the operators introduce a sequence point.
Additionally, if the value of the expression can be determined from the lhs, the rhs is not evaluated. In other words, if you have an expression like x && y, and x evaluates to 0 (false), then the value of the expression is false regardless of y, so y is not evaluated.
This means that expressions like x++ && x++ are well-defined, since && introduces a sequence point.
From draft N3485 (n3485.pdf), it is clearly stated that:
5.14 Logical AND operator [expr.log.and]
logical-and-expression:
inclusive-or-expression
logical-and-expression && inclusive-or-expression
The && operator groups left-to-right. The
operands are both contextually converted to bool (Clause 4). The
result is true if both operands are true and false otherwise. Unlike
&, && guarantees left-to-right evaluation: the second operand is not
evaluated if the first operand is false.
The result is a bool. If the
second expression is evaluated, every value computation and side
effect associated with the first expression is sequenced before every
value computation and side effect associated with the second
expression.
Disclaimer: I don't code like this, I'm just trying to understand how the c language works!!!!
The output is 12.
This expression (a-- == 10 && a-- == 9) evaluates left-to-right, and a is still 10 at a-- == 10 but a is 9 for a-- == 9.
1) Is there a clear rule as to when post-increment evaluate? From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
2) Also for c/c++, certain operators such as prefix decrement occur right to left so a == --a first decrements a to 9 and then compares 9 == 9. Is there a reason for why c/c++ is designed this way? I know for Java, it's the opposite (it evaluates left to right).
#include <stdio.h>
int main() {
int a = 10;
if (a-- == 10 && a-- == 9)
printf("1");
a = 10;
if (a == --a)
printf("2");
return 0;
}
The logical && operator contains a sequence point between the evaluation of the first and second operand. Part of this is that any side effect (such as that performed by the -- operator) as part of the left side is complete before the right side is evaluated.
This is detailed in section 6.5.13p4 of the C standard regarding the logical AND operator:
Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; if the second operand is evaluated,
there is a sequence point between the evaluations of the
first and second operands. If the first operand compares
equal to 0, the second operand is not evaluated.
In the case of this expression:
(a-- == 10 && a-- == 9)
The current value of a (10) is first compared for equality against 10. This is true, so the right side is then evaluated, but not before the side effect of decrementing a that was done on the left side. Then, the current value of a (now 9) is compared for equality against 9. This is also true, so the whole expression evaluates to true. Before the next statement is executed, the side effect of decrementing a that was done on the right side is done.
This expression however:
if (a == --a)
Involves reading and writing a in the same expression without a sequence point. This invokes undefined behavior.
This expression (a-- == 10 && a-- == 9) evaluates left-to-right,
Yes, mostly, but only because && is special.
and a is still 10 at a-- == 10
Yes, because a-- yields the old value.
but a is 9 for a-- == 9.
Yes, because the sequence point at && guarantees the update to a's value is complete before the RHS is evaluated.
1) Is there a clear rule as to when post-increment evaluate?
The best answer, I think, is "no". The side effects due to ++ and -- are completed at some point prior to the next sequence point, but beyond that, you can't say. For well-defined expressions, it doesn't matter when the side effects are completed. If an expression is sensitive to when the side effect is completed, that usually means the expression is undefined.
From this example it seems it evaluates prior to the && but after the ==. Is that because the && logical operator makes a-- == 10 a complete expression, so a is updated after it executes?
Basically yes.
2) Also for c/c++, certain operators such as prefix decrement occur right to left
Careful. I'm not sure what you mean, but whatever it is, I'm almost certain it's not true.
so a == --a first decrements a to 9 and then compares 9 == 9.
No, a == --a is undefined. There's no telling what it does.
Is there a reason for why c/c++ is designed this way?
Yes.
I know for Java, it's the opposite (it's evaluates left to right).
Yes, Java is different.
Here are some guidelines to help you understand the evaluation of C expressions:
1. Learn the rules of operator precedence and associativity. For "simple" expressions, those rules tell you virtually everything you need to know about an expression's evaluation. Given a + b * c, b is multiplied by c and then the product added to a, because of the higher precedence of * over +. Given a + b + c, a is added to b and then the sum added to c, because + associates from left to right.
2. With the exception of associativity (as mentioned in point 1), try not to use the words "left to right" or "right to left" evaluation at all. C has nothing like left to right or right to left evaluation. (Obviously Java is different.)
3. Where it gets tricky is side effects. (When I said "'simple' expressions" in point 1, I basically meant "expressions without side effects".) Side effects include (a) function calls, (b) assignments with =, (c) assignments with +=, -=, etc., and of course (d) increments/decrements with ++ and --. (If it matters when you fetch from a variable, which is typically only the case for variables qualified as volatile, we could add (e) fetches from volatile variables to the list.) In general, you can not tell when side effects happen. Try not to care. As long as you don't care (as long as your program is insensitive to the order in which side effects happen), it doesn't matter. But if your program is sensitive, it's probably undefined. (See more under points 4 and 5 below.)
4. You must never ever have two side effects in the same expression which attempt to alter the same variable. (Examples: i = i++, a++ + a++.) If you do, the expression is undefined.
5. With one class of exceptions, you must never ever have a side effect which attempts to alter a variable which is also being used elsewhere in the same expression. (Example: a == --a.) If you do, the expression is undefined. The exception is when the value accessed is being used to compute the value to be stored, as in i = i + 1.
With the "logical and" operator, (a-- == 10 && a-- == 9) is a well-defined expression (without undefined behavior, unlike a++ + a++).
C standard says about "logical and"/"logical or" operators:
guarantees left-to-right evaluation; there is a sequence point after
the evaluation of the first operand.
So that, all side effects of the first subexpression a-- == 10 are complete before the second subexpression a-- == 9 evaluation.
a is 9 before evaluation of second subexpression.
The underlying problem is that the postfix unary operator has both a return value (the starting value of the variable) and a side effect (incrementing or decrementing the variable). While the value has to be calculated in order, the C++ standard does not fix when that side effect happens relative to the other operators in a statement, as long as it happens before the full expression completes and no operator such as && imposes an order. This allows compilers (and optimizers) to do what they want, including evaluating them differently on different expressions in the same program.
From the C++20 code spec (C++ 2020 draft N4849 is where I got this from):
Every value computation and side effect associated with a full-expression is
sequenced before every value computation and side effect associated with the
next full-expression to be evaluated.
[6.9.1 9, p.72]
If a side effect on a memory location is unsequenced
relative to either another side effect on the same memory location or
a value computation using the value of any object in the same memory location, and they are not potentially concurrent, the behavior is undefined. [6.9.1 10, p.72]
So, in case you haven't gotten it from other answers:
No, there is no defined order for a postfix operator's side effect. However, in your case (a-- == 10 && a-- == 9) has defined behavior because the && enforces that the left side must be evaluated before the right side. It will always evaluate to true, and at the end, a==8. Other operators or functions, such as (a-- > a--), could show a lot of weird behavior, including a==9 at the end, because both postfix operators read the original value of a (10), decrement it to 9, and store that back to a.
In a == --a, not only is the side effect of setting a=a-1 (in the prefix operator) unsequenced with the rest of the expression, the evaluation of the operands of == is also unsequenced. This expression could:
Evaluate a(10), then evaluate --a(9), then == (false), then set a=9.
Evaluate --a(9), then a(10), then set a=9, then evaluate == (false).
Evaluate --a(9), then set a=9, then evaluate a(9), then evaluate == (true).
Yes, it is very confusing. As a general rule (which I think you already know): Do not set a variable more than once in the same statement, or use it and set it in the same statement. You have no idea what the compiler is going to do with it, especially if this is code that will be published open source, so someone might compile it differently than you did.
Side note:
I've seen so many responses to questions about the undefined behavior of postfix operators that complain that this "undefined behavior" only ever occurs in the toy examples presented in the questions. This really annoys me, because it can and does happen. Here is a real example of how the behavior can change that I had actually happen to me in my legacy code base.
result[ctr]=source[ctr++];
result[ctr++]=(another calculated value without ctr in it);
In Visual Studio, under C++14, this evaluated so that result had every other value of source in even indexes and had the calculated values in the odd indexes. For example, for ctr=0, it would store source[0], copy the stored value to result[0], then increment ctr, then set result[1] to the calculated value, then increment ctr. (Yes, there was a reason for wanting that result.)
We updated to C++20, and this line started breaking. We ended up with a bad array, because it would store source[0], then increment ctr, then copy the stored value to result[1], then set result[1] to the calculated value, then increment ctr. It was only setting the odd indexes in result, first from source then overwriting the source value with the calculated value. All the odd indexes of result stayed zero (our original initialization value).
Ugh.
I was reading my textbook for my computer architecture class and I came across this statement.
A second important distinction between the logical operators '&&' and '||' versus their bit-level counterparts '&' and '|' is that the logical operators do not evaluate their second argument if the result of the expression can be determined by evaluating the first argument. Thus, for example, the expression a && 5/a will never cause a division by zero, and the expression p && *p++ will never cause the dereferencing of a null pointer. (Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron, 3rd Edition, p. 57)
My question is why do logical operators in C behave like that? Using the author's example of a && 5/a, wouldn't C need to evaluate the whole expression because && requires both predicates to be true? Without loss of generality, my same question applies to his second example.
Short-circuiting is a performance enhancement that happens to be useful for other purposes.
You say "wouldn't C need to evaluate the whole expression because && requires both predicates to be true?" But think about it. If the left hand side of the && is false, does it matter what the right hand side evaluates to? false && true or false && false, the result is the same: false.
So when the left hand side of an && is determined to be false, or the left hand side of a || is determined to be true, the value on the right doesn't matter, and can be skipped. This makes the code faster by removing the need to evaluate a potentially expensive second test. Imagine if the right-hand side called a function that scanned a whole file for a given string? Wouldn't you want that test skipped if the first test meant you already knew the combined answer?
C decided to go beyond guaranteeing short-circuiting to guaranteeing order of evaluation because it means safety tests like the one you provide are possible. As long as the tests are idempotent, or the side-effects are intended to occur only when not short-circuited, this feature is desirable.
A typical example is a null-pointer check:
if(ptr != NULL && ptr->value) {
....
}
Without short-circuit-evaluation, this would cause an error when the null-pointer is dereferenced.
The program first checks the left part ptr != NULL. If this evaluates to false, it does not have to evaluate the second part, because it is already clear that the result will be false.
In the expression X && Y, if X is evaluated to false, then we know that X && Y will always be false, whatever is the value of Y. Therefore, there is no need to evaluate Y.
This trick is used in your example to avoid a division by 0. If a is evaluated to false (i.e. a == 0), then we do not evaluate 5/a.
It can also save a lot of time. For instance, when evaluating f() && g(), if the call to g() is expensive and if f() returns false, not evaluating g() is a nice feature.
wouldn't C need to evaluate the whole expression because && requires both predicates to be true?
Answer: No. Why work more when the answer is known "already"?
As per the definition of the logical AND operator, quoting C11, chapter §6.5.13
The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it
yields 0.
Following that analogy, for an expression of the form a && b, in case a evaluates to FALSE, irrespective of the evaluated result of b, the result will be FALSE. No need to waste machine cycles checking b and then returning FALSE anyway.
Same goes for logical OR operator, too, in case the first argument evaluates to TRUE, the return value condition is already found, and no need to evaluate the second argument.
It's just the rule and is extremely useful. Perhaps that's why it's the rule. It means we can write clearer code. An alternative - using if statements would produce much more verbose code since you can't use if statements directly within expressions.
You already give one example. Another is something like if (a && b / a) to prevent integer division by zero, the behaviour of which is undefined in C. That's what the author is guarding themselves from in writing a && 5/a.
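Very occasionally, if you do need both arguments evaluated (perhaps they call functions with side effects), you can use & and | instead. Bear in mind that these are bitwise operators: they do not guarantee the order in which their operands are evaluated, and they only match the logical result when both operands are already 0 or 1.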
In C have the following:
return (abc(1) || abc(2));
If abc(1) returns true (that is, abc(1) == 1), will abc(2) then be called?
No, it won't. This is called "short-circuiting" and it is a common flow-control mechanism:
With a && b, b is only evaluated if a is true; if a is false, the whole expression must necessarily be false.
With a || b, b is only evaluated if a is false; if a is true, the whole expression must necessarily be true.
No. It's guaranteed (by The Standard), that if abc(1) returns true, abc(2) will NOT be called.
If abc(1) returns false, then it's guaranteed that abc(2) WILL be called.
It's similar with &&: if you have abc(1) && abc(2), abc(2) will be called ONLY IF abc(1) returns true and will NOT be called if abc(1) returns false.
The idea behind this is:
true OR whatever -> true
false OR whatever -> whatever
false AND whatever -> false
true AND whatever -> whatever
This comes from Boolean algebra.
If abc(1==1) returns true will then call abc(2) ?
No, it won't. This behavior is known as short-circuiting. It is guaranteed by C and C++ standards.
C11(n1570), § 6.5.13 Logical AND operator
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation; if the second operand is evaluated, there is a sequence point between the evaluations of
the first and second operands. If the first operand compares equal to 0, the second operand is not evaluated.
(Emphasis is mine.)
The same applies to the || operator.
abc(2) will be called only if abc(1) is false
According to the C99 specification, the description of the logical OR operator says:
The || operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If the first operand compares unequal to 0, the second operand is not evaluated.
|| (logical OR) stops evaluating as soon as the result is known, while | (bitwise OR) always evaluates both operands.
You might also read: Difference between | and || or & and && for comparison
No, the second call, abc(2), is made only if the left operand is false.
In C, the logical || operator is tested from left-to-right, guaranteed. For the whole statement to be true, either condition can be true. So || keeps going from left to right until one condition is true, and then stops (or it gets to the end). So no, if abc(1) returns true then abc(2) will not be called.
Contrast with &&, which keeps going from left to right until one condition is false (or it gets to the end).
No. This is actually non-trivial, and it is defined in the standard that logical operators are evaluated from left to right. Evaluation stops as soon as the value can be determined without further evaluation of operands. At least, I am 100% sure of this for AND and OR.
This is a non-trivial matter because it means the evaluation of the operands cannot be implicitly parallelized or reordered as an optimization, since the expected outcome could differ.
E.g. reordering would cause run-time trouble in widely used patterns such as if (ptr && (ptr->number > other_number))
AFAIK Short circuit evaluation means that a boolean expression is evaluated only up to the point that we can guarantee its outcome.
This is a common idiom in perl where we can write things like:
(is_ok() returns non-zero value on "OK")
is_ok() || die "It's not OK!!\n";
instead of
if ( ! is_ok() ) {
die "It's not OK!!\n";
}
This only works because the order of evaluation is always left to right, which guarantees that the right-hand statement is executed only if the left-hand one has not already decided the result (here, only if is_ok() returns a false value).
In C I can do something similar like:
struct foo {
int some_flag;
} *ptr = 0;
/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
/* do something */
}
Is it safe to use this kind of idiom?
Or is there any chance that the compiler may generate code that evaluates ptr->some_flag before making sure that ptr is not a zero pointer? (I am assuming that if it is non-null it points to some valid memory region).
This syntax is convenient to use because it saves typing without losing readability (in my opinion anyway). However I'm not sure if it is entirely safe which is why I'd like to learn more on this.
NB: If the compiler has an effect on this, I'm using gcc 4.x
The evaluation order of short-circuit operators (|| and &&) is guaranteed by the standard to be left to right (otherwise they would lose part of their usefulness).
§6.5.13 ¶4
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation;
there is a sequence point after the evaluation of the first operand. If the first operand
compares equal to 0, the second operand is not evaluated.
§6.5.14 ¶4
Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is
a sequence point after the evaluation of the first operand. If the first operand compares
unequal to 0, the second operand is not evaluated.
Or is there any chance that the compiler may generate code that
evaluates ptr->some_flag before making sure that ptr is not a zero
pointer?
No, zero-chance. It is guaranteed by the standard that ptr->some_flag will not be evaluated if the first operand is false.
6.5.13-4
Unlike the bitwise binary & operator, the && operator guarantees
left-to-right evaluation; there is a sequence point after the
evaluation of the first operand. If the first operand compares equal to
0, the second operand is not evaluated.
/* do some work that may change value of ptr */
if ( 0!=ptr && ptr->some_flag ) {
/* do something */
}
Is it safe to use this kind of idiom?
Yes.
It is safe in the above example, however anyone maintaining such code may not realise that there is a dependency on the order of evaluation and cause an interesting bug.