Imagine i have the following piece of C-code where foo() produces a side effect and returns an integer:
if(bar) {
foo();
return 0;
}
Now, say I really like making my code compact, possibly at the reader's expense, and I change it into this:
if (bar)
return foo() && 0;
Can I be sure these two pieces of code will produce the same behavior, or would I risk the call to foo() not being executed due to possible compiler optimizations or something like that, thus not producing the desired side-effect?
NOTE: This is not a question about which piece of code is better, but whether the two pieces actually produce the same behavior in all cases. I think the majority (and I) can agree that the former piece of code should be used.
Yes, those two are the same. foo() will always be called (assuming bar is true).
The two forms you give are equivalent. The C11 standard (draft n1570) states,
6.5.13 Logical AND operator
...
Semantics
3 The && operator shall yield 1 if both of its operands compare unequal to 0;
otherwise, it yields 0. The result has type int.
4 Unlike the bitwise binary & operator, the && operator guarantees left-to-right
evaluation; if the second operand is evaluated, there is a sequence point between
the evaluations of the first and second operands. If the first operand compares
equal to 0, the second operand is not evaluated.
Similar language appeared in all C standards so far.
You should probably prefer using the comma operator here (return foo(), 0;) because:
It's shorter (one character versus two for the operator, and you can get away with removing the left space character when using a comma, for a total of two fewer characters).
It gives you more flexibility, as you can return non-scalar types (such as structs), and a wider range of integers than just 0 or 1.
It conveys the intent better: "Discard return value of foo() and return something else (0) instead".
Now if you do chance upon a compiler that deletes the call to foo(), then either the compiler managed to prove that foo() is a function with no visible side-effects, or more likely it has a serious bug and you should report it.
Why obfuscate your code in the latter?
Use the former.
Easier to read i.e. this is easier to understand
if(bar) {
foo();
return 0;
}
Or unless got a problem with job security
Related
I am a beginner in C and I came across a code which went like this:
#include<stdio.h>
int main()
{
int k=35;
printf("%d %d %d",k==35,k=50,k>40);
return 0;
}
I tried to find the output without writing the code in my computer and 'actually' running it first. So here's how I figured out what the output might be:
In the precedence table, among assignment (==), equality (=) and greater than (>) operators, the greater than operator will be implemented first. So, initially, k=35 and thus k>40 is false (the integer thus will be 0). Next up is the equality operator (==). Now, k==35 is true initially. So this will return an integer 1. Finally, the assignment operator (=) will do its job, set the value of k to 50(and ultimately, returning 50) and the program will exit. Thus, based on this logic, I 'guessed' that final output will be like:
1 50 0
Now I ran the code in my IDE (I'm coming to my IDE a bit later) and the result it gave was:
0 50 0
So, my confusion is, in which order are the operators being implemented?
Any help or hints are very much welcome.
NOTE: I noticed a bit closer and found that the only way possible is: First the > operator does its job, then the = operator and finally, the == operator.
I changed my printf line a bit and wrote:
printf("%d %d %d",k==35,k>40,k=50);
Here, I found that the output was:
0 1 50
which, again was not according to what I 'guessed' initially with my logic.
I tried all the possible ordering of the syntax inside the printf() function. After looking at all those outputs, I doubt: Is the program being implemented in a right-to-left order here irrespective of the ordering in the precedence table?
NOTE: Regarding my IDE, I use Dev C++ 5.11. I use it as our instructor has advised us to use it for our course. As this IDE is not so much well-received among many others, I tried an online compiler, but the results are the same.
The behavior is undefined - the result is not guaranteed to be predictable or repeatable.
First, precedence only controls which operators are grouped with which operands - it does not control the order in which expressions are evaluated.
Second, the order in which function arguments are evaluated is unspecified - they can be evaluated right-to-left, left-to-right, or any other order1.
Finally, you are updating k (k = 50) and attempting to use it in a value computation (k == 35, k > 40) without an intervening sequence point, which is explicitly called out by the language definition as undefined behavior.
So the result can literally be anything, even what you expect it to be. You can’t rely on it to be consistent or predictable.
This is not the same as the order in which they are passed to the function.
printf("%d %d %d",k==35,k=50,k>40);
This is undefined behavior.
The order of evaluation of the arguments is simply unspecified.
See here and here for further explanation.
Order of evaluation of the operands of any C operator, including the order of evaluation of function arguments in a function-call expression, and the order of evaluation of the subexpressions within any expression is unspecified[...]. The compiler will evaluate them in any order, and may choose another order when the same expression is evaluated again.
I would like to ask if this short code:
int i = 0;
has 1 operand or 2? The i is an operand, but is 0 too? According to wikipedia, 0 shouldn't (or maybe I misunderstand). If 0 isn't operand, is it a constant or something?
If it is important, the code is in C99.
In int i = 0;, = is not an operator. It's simply a part of the variable initializaton syntax. On the other hand, in int i; i = 0; it would be an operator.
Since = here is not an operator, there are no operands. Instead, 0 is the initializer.
Since you've tagged the question as "C", one way to look at it is by reading the C standard. As pointed out by this answer, initialization (such as int i = 0) is not an expression in itself, and by a strict reading of the standard, the = here is not an operator according to the usage of those terms in the standard.
It is not as clear whether i and 0 are operands, however. On one hand, the C standard does not seem to refer to the parts of the initialization as operands. On the other hand, it doesn't define the term "operand" exhaustively. For example, one could interpret section 6.3 as calling almost any non-operator part of an expression an "operand", and therefore at least the 0 would qualify as one.
(Also note that if the code was int i; i = 0; instead, the latter i and the 0 would definitely both be operands of the assignment operator =. It remains unclear whether the intent of the question was to make a distinction between assignment and initialization.)
Apart from the C standard, you also refer to Wikipedia, in particular:
In computing, an operand is the part of a computer instruction which specifies what data is to be manipulated or operated on, while at the same time representing the data itself.
If we consider the context of a "computer instruction", the C code might naively be translate to assembly code like mov [ebp-4], 0 where the two operands would clearly be the [ebp-4] (a location where the variable called i is stored) and the 0, which would make both i and 0 operands by this definition. Yet, in reality the code is likely to be optimized by the compiler into a different form, such as only storing i in a register, in which case zeroing it might become xor eax, eax where the 0 no longer exists as an explicit operand (but is the result of the operation). Or, the whole 0 might be optimized away and replaced by some different value that inevitably gets assigned. Or, the whole variable might end up being removed, e.g., if it is used as a loop counter and the loop is unrolled.
So, in the end, either it is something of a philosophical question ("does the zero exist as an operand if it gets optimized away"), or just a matter of deciding on the desired usage of the terms (perhaps depending on the context of discussion).
The i is an operand, but is 0 too? According to wikipedia, 0 shouldn't
(or maybe I misunderstand).
The question links to the Wikipedia page describing the term "operand" in a mathematical context. This likely factors in to your confusion about whether the 0 on the right-hand side of the = is an operand, because in a mathematical statement of equality such as appears in the article, it is in no way conventional to consider = an operator. It does not express a computation, so the expressions it relates are not operated upon. I.e. they do not function as operands.
You have placed the question in C context, however, and in C, = can be used as an assignment operator. This is a bona fide operator in that it expresses a (simple) operation that produces a result from two operands, and also has a side effect on the left-hand operand. In an assignment statement such as
i = 0;
, then, both i and 0 are operands.
With respect to ...
If 0 isn't operand, is it a constant or something?
... 0 standing on its own is a constant in C, but that has little to do with whether it serves as an operand in any given context. Being an operand is a way an expression, including simply 0, can be used. More on that in a moment.
Now it's unclear whether it was the intent, but the code actually presented in the question is very importantly different from the assignment statement above. As the other answers also observe,
int i = 0;
is a declaration, not a statement. In that context, the i is the identifier being declared, the 0 is used as its initializer, and just as the = in a mathematical equation is not an operator, the = introducing an initializer in a C declaration is not an operator either. No operation is performed here, in that no value computation is associated with this =, as opposed to when the same symbol is used as an assignment operator. There being no operator and no operation being performed, there also are no operands in this particular line of code.
I was reading my textbook for my computer architecture class and I came across this statement.
A second important distinction between the logical operators '&&' and '||' versus their bit-level counterparts '&' and '|' is that the logical operators do not evaluate their second argument if the result of the expression can be determined by evaluating the first argument. Thus, for example, the expression a && 5/a will never cause a division by zero, and the expression p && *p++ will never cause the dereferencing of a null pointer. (Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron, 3rd Edition, p. 57)
My question is why do logical operators in C behave like that? Using the author's example of a && 5/a, wouldn't C need to evaluate the whole expression because && requires both predicates to be true? Without loss of generality, my same question applies to his second example.
Short-circuiting is a performance enhancement that happens to be useful for other purposes.
You say "wouldn't C need to evaluate the whole expression because && requires both predicates to be true?" But think about it. If the left hand side of the && is false, does it matter what the right hand side evaluates to? false && true or false && false, the result is the same: false.
So when the left hand side of an && is determined to be false, or the left hand side of a || is determined to be true, the value on the right doesn't matter, and can be skipped. This makes the code faster by removing the need to evaluate a potentially expensive second test. Imagine if the right-hand side called a function that scanned a whole file for a given string? Wouldn't you want that test skipped if the first test meant you already knew the combined answer?
C decided to go beyond guaranteeing short-circuiting to guaranteeing order of evaluation because it means safety tests like the one you provide are possible. As long as the tests are idempotent, or the side-effects are intended to occur only when not short-circuited, this feature is desirable.
A typical example is a null-pointer check:
if(ptr != NULL && ptr->value) {
....
}
Without short-circuit-evaluation, this would cause an error when the null-pointer is dereferenced.
The program first checks the left part ptr != NULL. If this evaluates to false, it does not have to evaluate the second part, because it is already clear that the result will be false.
In the expression X && Y, if X is evaluated to false, then we know that X && Y will always be false, whatever is the value of Y. Therefore, there is no need to evaluate Y.
This trick is used in your example to avoid a division by 0. If a is evaluated to false (i.e. a == 0), then we do not evaluate 5/a.
It can also save a lot of time. For instance, when evaluating f() && g(), if the call to g() is expensive and if f() returns false, not evaluating g() is a nice feature.
wouldn't C need to evaluate the whole expression because && requires both predicates to be true?
Answer: No. Why work more when the answer is known "already"?
As per the definition of the logical AND operator, quoting C11, chapter §6.5.14
The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it
yields 0.
Following that analogy, for an expression of the form a && b, in case a evaluates to FALSE, irrespective of the evaluated result of b, the result will be FALSE. No need to waste machine cycle checking the b and then returning FALSE, anyway.
Same goes for logical OR operator, too, in case the first argument evaluates to TRUE, the return value condition is already found, and no need to evaluate the second argument.
It's just the rule and is extremely useful. Perhaps that's why it's the rule. It means we can write clearer code. An alternative - using if statements would produce much more verbose code since you can't use if statements directly within expressions.
You already give one example. Another is something like if (a && b / a) to prevent integer division by zero, the behaviour of which is undefined in C. That's what the author is guarding themselves from in writing a && 5/a.
Very occasionally if you do always need both arguments evaluated (perhaps they call functions with side effects), you can always use & and |.
Yes, this is a homework question, but I've done my research and a fair amount of deep thought on the topic and can't figure this out. The question states that this piece of code does NOT exhibit short-circuit behavior and asks why. But it looks to me like it does exhibit short-circuit behavior, so can someone explain why it doesn't?
In C:
int sc_and(int a, int b) {
return a ? b : 0;
}
It looks to me that in the case that a is false, the program will not try to evaluate b at all, but I must be wrong. Why does the program even touch b in this case, when it doesn't have to?
This is a trick question. b is an input argument to the sc_and method, and so will always be evaluated. In other-words sc_and(a(), b()) will call a() and call b() (order not guaranteed), then call sc_and with the results of a(), b() which passes to a?b:0. It has nothing to do with the ternary operator itself, which would absolutely short-circuit.
UPDATE
With regards to why I called this a 'trick question': It's because of the lack of well-defined context for where to consider 'short circuiting' (at least as reproduced by the OP). Many persons, when given just a function definition, assume that the context of the question is asking about the body of the function; they often do not consider the function as an expression in and of itself. This is the 'trick' of the question; To remind you that in programming in general, but especially in languages like C-likes that often have many exceptions to rules, you can't do that. Example, if the question was asked as such:
Consider the following code. Will sc_and exibit short-circuit behavior when called from main:
int sc_and(int a, int b){
return a?b:0;
}
int a(){
cout<<"called a!"<<endl;
return 0;
}
int b(){
cout<<"called b!"<<endl;
return 1;
}
int main(char* argc, char** argv){
int x = sc_and(a(), b());
return 0;
}
It would be immediately clear that you're supposed to be thinking of sc_and as an operator in and of itself in your own domain-specific language, and evaluating if the call to sc_and exhibits short-circuit behavior like a regular && would. I would not consider that to be a trick question at all, because it's clear you're not supposed to focus on the ternary operator, and are instead supposed to focus on C/C++'s function-call mechanics (and, I would guess, lead nicely into a follow-up question to write an sc_and that does short-circuit, which would involve using a #define rather than a function).
Whether or not you call what the ternary operator itself does short-circuiting (or something else, like 'conditional evaluation') depends on your definition of short-circuiting, and you can read the various comments for thoughts on that. By mine it does, but it's not terribly relevant to the actual question or why I called it a 'trick'.
When the statement
bool x = a && b++; // a and b are of int type
executes, b++ will not be evaluated if the operand a evaluated to false (short circuit behavior). This means that the side-effect on b will not take place.
Now, look at the function:
bool and_fun(int a, int b)
{
return a && b;
}
and call this
bool x = and_fun(a, b++);
In this case, whether a is true or false, b++ will always be evaluated1 during function call and side effect on b will always take place.
Same is true for
int x = a ? b : 0; // Short circuit behavior
and
int sc_and (int a, int b) // No short circuit behavior.
{
return a ? b : 0;
}
1 Order of evaluation of function arguments are unspecified.
As already pointed out by others, no matter what gets pass into the function as the two arguments, it gets evaluated as it gets passed in. That is way before the tenary operation.
On the other hand, this
#define sc_and(a, b) \
((a) ?(b) :0)
would "short-circuit", as this macro does not imply a function call and with this no evaluation of a function's argument(s) is performed.
Edited to correct the errors noted in #cmasters comment.
In
int sc_and(int a, int b) {
return a ? b : 0;
}
... the returned expression does exhibit short-circuit evaluation, but the function call does not.
Try calling
sc_and (0, 1 / 0);
The function call evaluates 1 / 0, though it is never used, hence causing - probably - a divide by zero error.
Relevant excerpts from the (draft) ANSI C Standard are:
2.1.2.3 Program execution
...
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
and
3.3.2.2 Function calls
....
Semantics
...
In preparing for the call to a function, the arguments are evaluated,
and each parameter is assigned the value of the corresponding
argument.
My guess is that each argument is evaluated as an expression, but that the argument list as a whole is not an expression, hence the non-SCE behaviour is mandatory.
As a splasher on the surface of the deep waters of the C standard, I'd appreciate a properly informed view on two aspects:
Does evaluating 1 / 0 produce undefined behaviour?
Is an argument list an expression? (I think not)
P.S.
Even you move to C++, and define sc_and as an inline function, you will not get SCE. If you define it as a C macro, as #alk does, you certainly will.
To clearly see ternary op short circuiting try changing the code slightly to use function pointers instead of integers:
int a() {
printf("I'm a() returning 0\n");
return 0;
}
int b() {
printf("And I'm b() returning 1 (not that it matters)\n");
return 1;
}
int sc_and(int (*a)(), int (*b)()) {
a() ? b() : 0;
}
int main() {
sc_and(a, b);
return 0;
}
And then compile it (even with almost NO optimization: -O0!). You will see b() is not executed if a() returns false.
% gcc -O0 tershort.c
% ./a.out
I'm a() returning 0
%
Here the generated assembly looks like:
call *%rdx <-- call a()
testl %eax, %eax <-- test result
je .L8 <-- skip if 0 (false)
movq -16(%rbp), %rdx
movl $0, %eax
call *%rdx <- calls b() only if not skipped
.L8:
So as others correctly pointed out the question trick is to make you focus on the ternary operator behaviour that DOES short circuit (call that 'conditional evaluation') instead of the parameter evaluation on call (call by value) that DOES NOT short circuit.
The C ternary operator can never short-circuit, because it only evaluates a single expression a (the condition), to determine a value given by expressions b and c, if any value might be returned.
The following code:
int ret = a ? b : c; // Here, b and c are expressions that return a value.
It's almost equivalent to the following code:
int ret;
if(a) {ret = b} else {ret = c}
The expression a may be formed by other operators like && or || that can short circuit because they may evaluate two expressions before returning a value, but that would not be considered as the ternary operator doing short-circuit but the operators used in the condition as it does in a regular if statement.
Update:
There is some debate about the ternary operator being a short-circuit operator. The argument says any operator that doesn't evaluate all it's operands does short-circuit according to #aruisdante in the comment below. If given this definition, then the ternary operator would be short-circuiting and in the case this is the original definition I agree. The problem is that the term "short-circuit" was originally used for a specific kind of operator that allowed this behavior and those are the logic/boolean operators, and the reason why are only those is what I'll try to explain.
Following the article Short-circuit Evaluation, the short-circuit evaluation is only referred to boolean operators implemented into the language in a way where knowing that the first operand will make the second irrelevant, this is, for the && operator being the first operand false, and for the || operator being the first operand true, the C11 spec also notes it in 6.5.13 Logical AND operator and 6.5.14 Logical OR operator.
This means that for the short-circuit behavior to be identified, you would expect to identify it in an operator that must evaluate all operands just like the boolean operators if the first operand doesn't make irrelevant the second. This is in line with what is written in another definition for the short-circuit in MathWorks under the "Logical short-circuiting" section, since short-circuiting comes from the logical operators.
As I've been trying to explain the C ternary operator, also called ternary if, only evaluates two of the operands, it evaluates the first one, and then evaluates a second one, either one of the two remaining depending on the value of the first one. It always does this, its not supposed to be evaluating all three in any situation, so there is no "short-circuit" in any case.
As always, if you see something is not right, please write a comment with an argument against this and not just a downvote, that just makes the SO experience worse, and I believe we can be a much better community that one that just downvotes answers one does not agree with.
I was writing a console application that would try to "guess" a number by trial and error, it worked fine and all but it left me wondering about a certain part that I wrote absentmindedly,
The code is:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int x,i,a,cc;
for(;;){
scanf("%d",&x);
a=50;
i=100/a;
for(cc=0;;cc++)
{
if(x<a)
{
printf("%d was too big\n",a);
a=a-((100/(i<<=1))?:1);
}
else if (x>a)
{
printf("%d was too small\n",a);
a=a+((100/(i<<=1))?:1);
}
else
{
printf("%d was the right number\n-----------------%d---------------------\n",a,cc);
break;
}
}
}
return 0;
}
More specifically the part that confused me is
a=a+((100/(i<<=1))?:1);
//Code, code
a=a-((100/(i<<=1))?:1);
I used ((100/(i<<=1))?:1) to make sure that if 100/(i<<=1) returned 0 (or false) the whole expression would evaluate to 1 ((100/(i<<=1))?:***1***), and I left the part of the conditional that would work if it was true empty ((100/(i<<=1))? _this space_ :1), it seems to work correctly but is there any risk in leaving that part of the conditional empty?
This is a GNU C extension (see ?: wikipedia entry), so for portability you should explicitly state the second operand.
In the 'true' case, it is returning the result of the conditional.
The following statements are almost equivalent:
a = x ?: y;
a = x ? x : y;
The only difference is in the first statement, x is always evaluated once, whereas in the second, x will be evaluated twice if it is true. So the only difference is when evaluating x has side effects.
Either way, I'd consider this a subtle use of the syntax... and if you have any empathy for those maintaining your code, you should explicitly state the operand. :)
On the other hand, it's a nice little trick for a common use case.
This is a GCC extension to the C language. When nothing appears between ?:, then the value of the comparison is used in the true case.
The middle operand in a conditional expression may be omitted. Then if the first operand is nonzero, its value is the value of the conditional expression.
Therefore, the expression
x ? : y
has the value of x if that is nonzero; otherwise, the value of y.
This example is perfectly equivalent to
x ? x : y
In this simple case, the ability to omit the middle operand is not especially useful. When it becomes useful is when the first operand does, or may (if it is a macro argument), contain a side effect. Then repeating the operand in the middle would perform the side effect twice. Omitting the middle operand uses the value already computed without the undesirable effects of recomputing it.