?: ternary conditional operator behaviour when leaving one expression empty - c

I was writing a console application that tries to "guess" a number by trial and error. It worked fine, but it left me wondering about a certain part that I wrote absentmindedly.
The code is:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int x,i,a,cc;
    for(;;){
        scanf("%d",&x);
        a=50;
        i=100/a;
        for(cc=0;;cc++)
        {
            if(x<a)
            {
                printf("%d was too big\n",a);
                a=a-((100/(i<<=1))?:1);
            }
            else if (x>a)
            {
                printf("%d was too small\n",a);
                a=a+((100/(i<<=1))?:1);
            }
            else
            {
                printf("%d was the right number\n-----------------%d---------------------\n",a,cc);
                break;
            }
        }
    }
    return 0;
}
More specifically the part that confused me is
a=a+((100/(i<<=1))?:1);
//Code, code
a=a-((100/(i<<=1))?:1);
I used ((100/(i<<=1))?:1) to make sure that if 100/(i<<=1) evaluated to 0 (false), the whole expression would evaluate to 1, and I left empty the operand that would be used if it were true: ((100/(i<<=1))? <this space> :1). It seems to work correctly, but is there any risk in leaving that part of the conditional empty?

This is a GNU C extension (see the Wikipedia entry on ?:), so for portability you should explicitly state the second operand.
In the 'true' case, it returns the value of the condition (the first operand).
The following statements are almost equivalent:
a = x ?: y;
a = x ? x : y;
The only difference is that in the first statement x is always evaluated exactly once, whereas in the second x is evaluated twice if it is true. That matters only when evaluating x has side effects.
Either way, I'd consider this a subtle use of the syntax... and if you have any empathy for those maintaining your code, you should explicitly state the operand. :)
On the other hand, it's a nice little trick for a common use case.

This is a GCC extension to the C language. When nothing appears between ? and :, the value of the first operand (the condition) is used in the true case.
The middle operand in a conditional expression may be omitted. Then if the first operand is nonzero, its value is the value of the conditional expression.
Therefore, the expression
    x ? : y
has the value of x if that is nonzero; otherwise, the value of y.
This example is perfectly equivalent to
    x ? x : y
In this simple case, the ability to omit the middle operand is not especially useful. When it becomes useful is when the first operand does, or may (if it is a macro argument), contain a side effect. Then repeating the operand in the middle would perform the side effect twice. Omitting the middle operand uses the value already computed without the undesirable effects of recomputing it.
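For code that has to compile without the GNU extension, the single-evaluation behaviour can be kept by storing the first operand in a temporary. The following is only a sketch built around the step calculation from the question; the helper names halving_step and halving_step_repeated are made up for illustration and do not appear in the original program.

```c
/* Hypothetical helpers (not from the original program) built around the
   question's step expression 100/(i<<=1). */

/* Portable stand-in for (100/(*i <<= 1)) ?: 1.
   The shift, a side effect on *i, happens exactly once. */
static int halving_step(int *i)
{
    int step = 100 / (*i <<= 1);
    return step ? step : 1;   /* fall back to 1 when the division gives 0 */
}

/* Naive expansion x ? x : y: the operand is written twice, so *i is
   shifted twice whenever the first result is nonzero. */
static int halving_step_repeated(int *i)
{
    return (100 / (*i <<= 1)) ? (100 / (*i <<= 1)) : 1;
}
```

Starting from i = 2, halving_step returns 25 and leaves i at 4, whereas halving_step_repeated shifts i to 8 and returns 12.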

Related

Why do logical operators in C not evaluate the entire expression when it's not necessary to?

I was reading my textbook for my computer architecture class and I came across this statement.
A second important distinction between the logical operators '&&' and '||' versus their bit-level counterparts '&' and '|' is that the logical operators do not evaluate their second argument if the result of the expression can be determined by evaluating the first argument. Thus, for example, the expression a && 5/a will never cause a division by zero, and the expression p && *p++ will never cause the dereferencing of a null pointer. (Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron, 3rd Edition, p. 57)
My question is why do logical operators in C behave like that? Using the author's example of a && 5/a, wouldn't C need to evaluate the whole expression because && requires both predicates to be true? Without loss of generality, my same question applies to his second example.
Short-circuiting is a performance enhancement that happens to be useful for other purposes.
You say "wouldn't C need to evaluate the whole expression because && requires both predicates to be true?" But think about it. If the left hand side of the && is false, does it matter what the right hand side evaluates to? false && true or false && false, the result is the same: false.
So when the left hand side of an && is determined to be false, or the left hand side of a || is determined to be true, the value on the right doesn't matter, and can be skipped. This makes the code faster by removing the need to evaluate a potentially expensive second test. Imagine if the right-hand side called a function that scanned a whole file for a given string? Wouldn't you want that test skipped if the first test meant you already knew the combined answer?
C decided to go beyond guaranteeing short-circuiting to guaranteeing order of evaluation because it means safety tests like the one you provide are possible. As long as the tests are idempotent, or the side-effects are intended to occur only when not short-circuited, this feature is desirable.
A typical example is a null-pointer check:
if(ptr != NULL && ptr->value) {
....
}
Without short-circuit-evaluation, this would cause an error when the null-pointer is dereferenced.
The program first checks the left part ptr != NULL. If this evaluates to false, it does not have to evaluate the second part, because it is already clear that the result will be false.
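As a small self-contained sketch of that null-pointer check (the struct node type and the has_value helper are assumptions made up for this illustration):

```c
#include <stddef.h>

/* Hypothetical node type, used only for this illustration. */
struct node {
    int value;
};

/* Returns 1 if ptr is non-null and its value is nonzero, 0 otherwise.
   ptr->value is only read when the left-hand test has succeeded. */
static int has_value(const struct node *ptr)
{
    return ptr != NULL && ptr->value;
}
```

Calling has_value(NULL) simply yields 0; the dereference on the right-hand side is never reached.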
In the expression X && Y, if X is evaluated to false, then we know that X && Y will always be false, whatever is the value of Y. Therefore, there is no need to evaluate Y.
This trick is used in your example to avoid a division by 0. If a is evaluated to false (i.e. a == 0), then we do not evaluate 5/a.
It can also save a lot of time. For instance, when evaluating f() && g(), if the call to g() is expensive and if f() returns false, not evaluating g() is a nice feature.
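One way to observe the skipped evaluation is to count calls. In this sketch f, g, both and guarded_quotient are hypothetical names, with g standing in for an expensive test:

```c
static int g_calls = 0;   /* how many times g() has run */

static int f(int v) { return v; }             /* hypothetical cheap test */
static int g(void) { g_calls++; return 1; }   /* stand-in for an expensive test */

/* g() runs only when f(v) is nonzero. */
static int both(int v) { return f(v) && g(); }

/* The textbook expression: 5 / a is never evaluated when a is 0. */
static int guarded_quotient(int a) { return a && 5 / a; }
```

After both(0), g_calls is still 0; after both(1), it is 1. guarded_quotient(0) yields 0 without performing any division.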
wouldn't C need to evaluate the whole expression because && requires both predicates to be true?
Answer: No. Why work more when the answer is known "already"?
As per the definition of the logical AND operator, quoting C11, chapter §6.5.13
The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it
yields 0.
Following that definition, for an expression of the form a && b, if a evaluates to FALSE, the result will be FALSE irrespective of the value of b. There is no need to waste machine cycles evaluating b only to return FALSE anyway.
The same goes for the logical OR operator: if the first argument evaluates to TRUE, the result is already known, and there is no need to evaluate the second argument.
It's just the rule and is extremely useful. Perhaps that's why it's the rule. It means we can write clearer code. An alternative - using if statements would produce much more verbose code since you can't use if statements directly within expressions.
You already give one example. Another is something like if (a && b / a) to prevent integer division by zero, the behaviour of which is undefined in C. That's what the author is guarding themselves from in writing a && 5/a.
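Very occasionally, if you do need both operands evaluated every time (perhaps they call functions with side effects), you can use & and | instead. Be aware that these are bitwise operators: they give the same truth value as && and || only when both operands are already 0 or 1 (2 & 1 is 0, while 2 && 1 is 1), and the order in which their operands are evaluated is unspecified.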
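A caveat when swapping && for &: the bitwise operator works on bit patterns, so operands that are nonzero but not 1 may need normalizing first. A minimal sketch, in which probe, lazy_and, eager_and and raw_bitwise_and are hypothetical names:

```c
static int probes = 0;   /* counts evaluations of probe() */

static int probe(int v) { probes++; return v; }

/* Short-circuits: the second probe() is skipped when the first yields 0. */
static int lazy_and(int a, int b) { return probe(a) && probe(b); }

/* Always evaluates both operands (in unspecified order); !! maps any
   nonzero value to 1 so that the bitwise & gives the logical answer. */
static int eager_and(int a, int b) { return !!probe(a) & !!probe(b); }

/* Without normalizing, bit patterns decide the result: 2 & 1 is 0. */
static int raw_bitwise_and(int a, int b) { return a & b; }
```

Here lazy_and(0, 1) evaluates one operand and eager_and(0, 1) evaluates two, while raw_bitwise_and(2, 1) is 0 even though both inputs are "true".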

Why ternary operator does not support blocks?

Why does the ternary operator not support blocks? In other words, why does the following code not work and report an error for the {} braces?
int main()
{
int i = 1;
(i==1)?{printf("Hello\n")}:{printf("World\n")};
return 0;
}
EDIT
Perhaps the question was misunderstood. It was: why are blocks not supported? Why only a single expression?
Why is this not allowed to work?
int main()
{
int i = 1;
(i==1)?{printf("Hello\n");printf("World\n");}:{printf("Bye\n");printf("World\n");};
return 0;
}
One reason could be that the ternary operator is often used for conditional assignment, where its value is assigned to the left-hand side, and blocks yield no such value; or that it would get confusing with multiple statements inside a block.
To quote C11 standard, chapter §6.5.15, the syntax of the conditional operator is
conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-expression
Here the second and third operands are expressions, not statements.
Just to elaborate,
One of the following shall hold for the second and third operands:
— both operands have arithmetic type;
— both operands have the same structure or union type;
— both operands have void type;
— both operands are pointers to qualified or unqualified versions of compatible types;
— one operand is a pointer and the other is a null pointer constant; or
— one operand is a pointer to an object type and the other is a pointer to a qualified or
unqualified version of void.
Edit:
To answer the question
Why only single expression?
again, quoting the standard,
....the result is the value of the second or third operand (whichever is evaluated), converted to the type described below.
A block of statements will not give a value. The evaluation of an expression can.
The ternary operator consists of expressions. There is no kind of expression that uses braces as a block.
You can write simply
( i == 1 ) ? printf("Hello\n") : printf("World\n");
It seems that the only case when braces can be present in an expression is the use of a compound literal. For example
struct A { int x; int y; } a = { 1, 2 };
a = a.x < a.y ? ( struct A ){ a.y, a.x } : ( struct A ){ ++a.x, --a.y };
As for this statement
(i==1)?{printf("Hello\n");printf("World\n");}:{printf("Bye\n");printf("World\n");};
then it can be rewritten the following way using the comma operator
i == 1 ? ( printf("Hello\n"), printf("World\n") ) : ( printf("Bye\n"), printf("World\n") );
Or even like
i == 1 ? printf("Hello\n"), printf("World\n") : ( printf("Bye\n"), printf("World\n") );
In short: if you need a code block, use the if-else statement instead of the ternary operator, although the if-else statement cannot be used inside an expression. On the other hand, for readability it is desirable that expressions not be too complex.
Like any operator, the ternary operator is used in expressions and yields a value. For example, as an expression it can be used as an initializer or in assignments.
The ternary operator expects an expression for each part, and {...} is not an expression but a statement.
To expand on your edit, the result of a ternary operator is an expression (but not an lvalue as you suggest), and statement blocks can't evaluate to a value.
For example, this doesn't make sense:
int x = (i==1)?{printf("Hello\n");printf("World\n");}:{printf("Bye\n");printf("World\n");};
But you could do this:
int x = (i==1)?(printf("Hello\n"), printf("World\n")):(printf("Bye\n"), printf("World\n"));
In which case, the comma operator would cause the last value in each subexpression to be returned.
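To make the "last value" point concrete, here is a sketch in which each branch prints something and then yields a constant; the function name greet and the values 10 and 20 are arbitrary choices for illustration:

```c
#include <stdio.h>

/* Each branch is a parenthesized comma expression: the printf call runs
   for its side effect, and the value of the branch is its last operand. */
static int greet(int i)
{
    return (i == 1) ? (printf("Hello\n"), 10)
                    : (printf("Bye\n"), 20);
}
```

greet(1) prints Hello and returns 10; any other argument prints Bye and returns 20.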
Operators in the C language can only be used in expressions. There's no such thing as a "block" in an expression. In C, blocks are elements of a higher-level syntactic structure: they exist at the level of statements. An expression can be used in a statement, but a statement cannot become an expression (or be used inside an expression).
Your particular example can be rewritten in terms of expressions
i == 1 ?
printf("Hello\n"), printf("World\n") :
( printf("Bye\n"), printf("World\n") );
without any need for {}.
(See Uses of C comma operator for extra information)
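One caveat with comma sequences inside ?: is precedence. The middle operand may contain a top-level comma, but the third operand may not: a comma there ends the conditional expression, so whatever follows it runs unconditionally. The sketch below counts calls to show the difference; tick, loose and grouped are hypothetical names.

```c
static int ticks = 0;   /* number of tick() calls since the last reset */

static int tick(void) { return ++ticks; }

/* Parsed as: (i == 1 ? (tick(), tick()) : tick()), tick();
   so the final tick() always runs, whichever branch was taken. */
static int loose(int i)
{
    ticks = 0;
    i == 1 ? tick(), tick() : tick(), tick();
    return ticks;
}

/* Parenthesizing the third operand keeps both calls inside the branch. */
static int grouped(int i)
{
    ticks = 0;
    i == 1 ? (tick(), tick()) : (tick(), tick());
    return ticks;
}
```

With i equal to 1, loose makes three calls while grouped makes two.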
Yes, only one expression is possible in each slot of the ternary operator. You have to use if-else for multiple statements.
You can, however, call two different functions in a ternary operator:
#include <stdio.h>
void a(){
printf("Hello\n");
printf("Hi\n");
}
void b(){
printf("Hi\n");
printf("Hello\n");
}
int main()
{
int i = 1;
(i == 1) ? a() : b();
return 0;
}
The ternary operator is not meant to be used as a control structure, meaning it's not meant to control execution of statements. It's simply a way to choose which of two expressions will be evaluated.
As Sourav Ghosh has shown, the syntax of a conditional expression simply does not allow the operands of the ?: operator to be statements.
This is not allowed because it makes no sense. The ternary operator is meant to return a value. What would that be for {} blocks?
And then also there is another construct if () { } else { } that already serves the same purpose that you are trying to give to ? :. Doesn't this here look much nicer than the code that you posted?
#include <stdio.h>
int main(void)
{
    int i = 1;
    if (i==1) {
        printf("Hello\n");
        printf("World\n");
    } else {
        printf("Bye\n");
        printf("World\n");
    }
    return 0;
}
As others have noted, GCC allows statements to be used syntactically as expressions, but such a feature is not part of the C standard. Historically, the reason for this likely had to do with the fact that statements are allowed to declare variables, and many systems use the same stack to hold local variables and parameters as is used to hold temporary values used in expression evaluation. Allowing new variables to come into existence within the execution of a statement while still keeping old variables in view would add some complexity to the compiler [note that when a function is called within a new expression, new variables are created for that function, but the old values will be "out of view" until the called function returns and the new variables are destroyed].
That having been said, other features of C such as variable-length arrays require far more complexity than would the ability to embed statements in expressions, so the arguments in favor of that design are no longer as compelling as they were in the 1970s. Unfortunately, the fact that something was once a compelling reason not to include a feature in a language may cause it to forevermore be perceived that way, even if design considerations for today's compilers are nothing like those of the 1970s.
Why ternary operator does not support blocks?
For roughly the same reason that a bicycle does not support wings.
If you have two blocks of statements, and you want to execute one or the other based on a condition, there's a perfectly good way to do that: the if/else statement.
And if you have two expressions, and you want to evaluate one or the other based on a condition, that's what the ternary or ?: operator is for.
But asking the ?: operator to execute one or the other of two blocks of statements is asking it to do something it's not meant for. Continuing the jokey analogies, it's like asking, why can't a hammer drive screws?
This distinction — that if/else is for blocks of statements, and ?: is for expressions — flows out of C's fundamental distinction between expressions and statements. They're only partially integrated in C: you can turn any expression into a statement by putting a semicolon after it, but you can not, in general, use a statement as an expression. Why not? Well, partly because the syntax of the language doesn't permit it, and partly because there's no universal definition of what the value of a statement (or a block of statements) is.
Now, you can certainly imagine a language in which every statement has a well-defined value. In such a language, if/else and ?: would probably be fully interchangeable. And, in fact, some C compilers do implement this full integration of expressions and statements, as an extension — but it's an extension, not Standard C.

Why is this version of logical AND in C not showing short-circuit behavior?

Yes, this is a homework question, but I've done my research and a fair amount of deep thought on the topic and can't figure this out. The question states that this piece of code does NOT exhibit short-circuit behavior and asks why. But it looks to me like it does exhibit short-circuit behavior, so can someone explain why it doesn't?
In C:
int sc_and(int a, int b) {
return a ? b : 0;
}
It looks to me that in the case that a is false, the program will not try to evaluate b at all, but I must be wrong. Why does the program even touch b in this case, when it doesn't have to?
This is a trick question. b is an input argument to the sc_and function, and so will always be evaluated. In other words, sc_and(a(), b()) will call a() and call b() (order not guaranteed), then call sc_and with the results of a() and b(), which are passed to a?b:0. It has nothing to do with the ternary operator itself, which would absolutely short-circuit.
UPDATE
With regards to why I called this a 'trick question': it's because of the lack of well-defined context for where to consider 'short-circuiting' (at least as reproduced by the OP). Many people, when given just a function definition, assume that the question is asking about the body of the function; they often do not consider the function call as an expression in and of itself. This is the 'trick' of the question: to remind you that in programming in general, and especially in C-like languages that often have many exceptions to rules, you can't do that. For example, if the question had been asked like this:
Consider the following code. Will sc_and exhibit short-circuit behavior when called from main?
#include <stdio.h>
int sc_and(int a, int b){
    return a?b:0;
}
int a(void){
    printf("called a!\n");
    return 0;
}
int b(void){
    printf("called b!\n");
    return 1;
}
int main(void){
    int x = sc_and(a(), b());
    return 0;
}
It would be immediately clear that you're supposed to be thinking of sc_and as an operator in and of itself in your own domain-specific language, and evaluating if the call to sc_and exhibits short-circuit behavior like a regular && would. I would not consider that to be a trick question at all, because it's clear you're not supposed to focus on the ternary operator, and are instead supposed to focus on C/C++'s function-call mechanics (and, I would guess, lead nicely into a follow-up question to write an sc_and that does short-circuit, which would involve using a #define rather than a function).
Whether or not you call what the ternary operator itself does short-circuiting (or something else, like 'conditional evaluation') depends on your definition of short-circuiting, and you can read the various comments for thoughts on that. By mine it does, but it's not terribly relevant to the actual question or why I called it a 'trick'.
When the statement
bool x = a && b++; // a and b are of int type
executes, b++ will not be evaluated if the operand a evaluated to false (short circuit behavior). This means that the side-effect on b will not take place.
Now, look at the function:
bool and_fun(int a, int b)
{
return a && b;
}
and call this
bool x = and_fun(a, b++);
In this case, whether a is true or false, b++ will always be evaluated [1] during the function call, and the side effect on b will always take place.
Same is true for
int x = a ? b : 0; // Short circuit behavior
and
int sc_and (int a, int b) // No short circuit behavior.
{
return a ? b : 0;
}
[1] The order of evaluation of function arguments is unspecified.
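The contrast between using the operator directly and going through a function can be checked with a counter-like variable. This is a sketch only: and_fun is written with int instead of bool to avoid an extra header, and direct_and and called_and are hypothetical wrappers added for the comparison.

```c
static int and_fun(int a, int b) { return a && b; }

/* Operator used directly: (*b)++ is skipped when a is 0. */
static int direct_and(int a, int *b) { return a && (*b)++; }

/* Same test through a function call: the argument (*b)++ is evaluated
   before and_fun even starts, so *b is always incremented. */
static int called_and(int a, int *b) { return and_fun(a, (*b)++); }
```

With b starting at 5, direct_and(0, &b) leaves b at 5, while called_and(0, &b) bumps it to 6.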
As already pointed out by others, whatever gets passed into the function as the two arguments is evaluated as it is passed in. That happens well before the ternary operation.
On the other hand, this
#define sc_and(a, b) \
((a) ?(b) :0)
would "short-circuit", as this macro does not imply a function call and with this no evaluation of a function's argument(s) is performed.
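A side-by-side sketch of the two forms, counting how often the second argument is evaluated. The names noisy_b, sc_and_fn, SC_AND_MACRO, through_function and through_macro are assumptions chosen so that the function and the macro can coexist in one file:

```c
static int b_calls = 0;   /* how many times noisy_b() has run */

static int noisy_b(void) { b_calls++; return 1; }

/* Function form: both arguments are evaluated before the call. */
static int sc_and_fn(int a, int b) { return a ? b : 0; }

/* Macro form: the argument text is pasted into the ?: expression, so the
   second argument is evaluated only if the first is nonzero. */
#define SC_AND_MACRO(a, b) ((a) ? (b) : 0)

static int through_function(int a) { return sc_and_fn(a, noisy_b()); }
static int through_macro(int a)    { return SC_AND_MACRO(a, noisy_b()); }
```

through_function(0) still calls noisy_b() once; through_macro(0) does not call it at all.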
Edited to correct the errors noted in @cmaster's comment.
In
int sc_and(int a, int b) {
return a ? b : 0;
}
... the returned expression does exhibit short-circuit evaluation, but the function call does not.
Try calling
sc_and (0, 1 / 0);
The function call evaluates 1 / 0, though it is never used, hence causing - probably - a divide by zero error.
Relevant excerpts from the (draft) ANSI C Standard are:
2.1.2.3 Program execution
...
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
and
3.3.2.2 Function calls
....
Semantics
...
In preparing for the call to a function, the arguments are evaluated,
and each parameter is assigned the value of the corresponding
argument.
My guess is that each argument is evaluated as an expression, but that the argument list as a whole is not an expression, hence the non-SCE behaviour is mandatory.
As a splasher on the surface of the deep waters of the C standard, I'd appreciate a properly informed view on two aspects:
Does evaluating 1 / 0 produce undefined behaviour?
Is an argument list an expression? (I think not)
P.S.
Even if you move to C++ and define sc_and as an inline function, you will not get SCE. If you define it as a C macro, as @alk does, you certainly will.
To clearly see ternary op short circuiting try changing the code slightly to use function pointers instead of integers:
#include <stdio.h>
int a() {
    printf("I'm a() returning 0\n");
    return 0;
}
int b() {
    printf("And I'm b() returning 1 (not that it matters)\n");
    return 1;
}
int sc_and(int (*a)(), int (*b)()) {
    return a() ? b() : 0;
}
int main() {
    sc_and(a, b);
    return 0;
}
And then compile it (even with almost NO optimization: -O0!). You will see b() is not executed if a() returns false.
% gcc -O0 tershort.c
% ./a.out
I'm a() returning 0
%
Here the generated assembly looks like:
call *%rdx <-- call a()
testl %eax, %eax <-- test result
je .L8 <-- skip if 0 (false)
movq -16(%rbp), %rdx
movl $0, %eax
call *%rdx <- calls b() only if not skipped
.L8:
So, as others correctly pointed out, the trick in the question is to make you focus on the ternary operator's behaviour, which DOES short-circuit (call that 'conditional evaluation'), instead of the parameter evaluation on call (call by value), which DOES NOT short-circuit.
The C ternary operator can never short-circuit, because it only evaluates a single expression a (the condition), to determine a value given by expressions b and c, if any value might be returned.
The following code:
int ret = a ? b : c; // Here, b and c are expressions that return a value.
It's almost equivalent to the following code:
int ret;
if (a) { ret = b; } else { ret = c; }
The expression a may be formed by other operators like && or || that can short circuit because they may evaluate two expressions before returning a value, but that would not be considered as the ternary operator doing short-circuit but the operators used in the condition as it does in a regular if statement.
Update:
There is some debate about whether the ternary operator is a short-circuit operator. According to @aruisdante in the comment below, any operator that doesn't evaluate all its operands short-circuits. Given this definition, the ternary operator would be short-circuiting, and if this is the original definition, I agree. The problem is that the term "short-circuit" was originally used for a specific kind of operator that allowed this behavior, namely the logical/boolean operators, and I'll try to explain why it applies only to those.
According to the article Short-circuit Evaluation, the term refers only to boolean operators implemented so that the second operand is not evaluated when the first operand already determines the result: for &&, when the first operand is false, and for ||, when the first operand is true. The C11 spec also notes this in 6.5.13 Logical AND operator and 6.5.14 Logical OR operator.
This means that short-circuit behavior can only be identified in an operator that, like the boolean operators, must evaluate all of its operands whenever the first operand does not make the second irrelevant. This is in line with another definition of short-circuiting, given by MathWorks under the "Logical short-circuiting" section, since short-circuiting comes from the logical operators.
As I've been trying to explain, the C ternary operator, also called ternary if, only ever evaluates two of its operands: it evaluates the first one, and then evaluates one of the two remaining ones depending on the value of the first. It always does this; it's not supposed to evaluate all three in any situation, so there is no "short-circuit" in any case.
As always, if you see something that is not right, please write a comment with an argument against it and not just a downvote. That just makes the SO experience worse, and I believe we can be a much better community than one that just downvotes answers one does not agree with.

Is it safe to use foo() && 0 in C?

Imagine I have the following piece of C code, where foo() produces a side effect and returns an integer:
if(bar) {
foo();
return 0;
}
Now, say I really like making my code compact, possibly at the reader's expense, and I change it into this:
if (bar)
return foo() && 0;
Can I be sure these two pieces of code will produce the same behavior, or would I risk the call to foo() not being executed due to possible compiler optimizations or something like that, thus not producing the desired side-effect?
NOTE: This is not a question about which piece of code is better, but whether the two pieces actually produce the same behavior in all cases. I think the majority (and I) can agree that the former piece of code should be used.
Yes, those two are the same. foo() will always be called (assuming bar is true).
The two forms you give are equivalent. The C11 standard (draft n1570) states,
6.5.13 Logical AND operator
...
Semantics
3 The && operator shall yield 1 if both of its operands compare unequal to 0;
otherwise, it yields 0. The result has type int.
4 Unlike the bitwise binary & operator, the && operator guarantees left-to-right
evaluation; if the second operand is evaluated, there is a sequence point between
the evaluations of the first and second operands. If the first operand compares
equal to 0, the second operand is not evaluated.
Similar language appeared in all C standards so far.
You should probably prefer using the comma operator here (return foo(), 0;) because:
It's shorter (one character versus two for the operator, and you can get away with removing the left space character when using a comma, for a total of two fewer characters).
It gives you more flexibility, as you can return non-scalar types (such as structs), and a wider range of integers than just 0 or 1.
It conveys the intent better: "Discard return value of foo() and return something else (0) instead".
Now if you do chance upon a compiler that deletes the call to foo(), then either the compiler managed to prove that foo() is a function with no visible side-effects, or more likely it has a serious bug and you should report it.
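Both compact forms can be checked against a call counter. In this sketch foo's body, the value 42, the counter foo_calls, and the -1 returned when bar is false are all assumptions added for illustration:

```c
static int foo_calls = 0;   /* the "side effect" of foo() */

/* Hypothetical foo(): records the call and returns some nonzero integer. */
static int foo(void) { foo_calls++; return 42; }

/* foo() is the left operand of &&, so it is always evaluated first;
   the constant 0 on the right then forces the result to 0. */
static int with_and(int bar)
{
    if (bar)
        return foo() && 0;
    return -1;
}

/* Comma operator: evaluate foo(), discard its value, yield 0. */
static int with_comma(int bar)
{
    if (bar)
        return (foo(), 0);
    return -1;
}
```

Each of with_and(1) and with_comma(1) returns 0 and increments foo_calls exactly once.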
Why obfuscate your code with the latter?
Use the former.
It is easier to read and easier to understand:
if(bar) {
foo();
return 0;
}
Unless you have a problem with job security.

with conditional ?: expression, at what point do postfix operations occur?

For example, the difference between these two statements:
if ( ucNum++ >= 3 ) // ucNum incremented after comparing its value to 3, correct?
{
ucNum = 0;
}
vs.
ucNum++ >= 3 ? ucNum = 0 : 1; // does incrementing it happen somewhere in the middle of the inline?
Perhaps it is compiler specific. Where should it occur in the conditional expression?
The rules are that the condition is evaluated before choosing which alternative to evaluate. Since part of the evaluation is the ++, the increment will occur before the assignment (if the assignment occurs at all).
As @caf comments, there is a sequence point after the controlling expression. So, while (as David Thornley points out) the order of expression evaluations can be rearranged by the compiler (particularly side effect evaluations), the rearranging cannot cross sequence points.
Well, I've tested this practically (good thing printf returns int) with:
int ucNum = 4;
ucNum++ >= 3 ? printf("%d", ucNum) : 1;
Since the condition is true, it goes to the printf which prints 5. So definitely ucNum is incremented between the evaluation of the condition and the choosing of the return value.
You're looking for 6.5.15/4 in the C Standard. The first expression is completely evaluated, including side effects, before the selected one of the other two. This is not compiler-dependent, except in the sense that some compilers may be broken.
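A sketch that follows from this rule: because the increment is complete before either branch is evaluated, the selected branch already sees the incremented value. The function names bump_and_reset and value_seen_in_branch are hypothetical, and int is used in place of the question's ucNum type.

```c
/* Mirrors the question's one-liner: compare the old value with 3,
   increment, then either reset to 0 or yield 1. */
static int bump_and_reset(int *n)
{
    return (*n)++ >= 3 ? (*n = 0) : 1;
}

/* The branch reads n after the increment has taken effect. */
static int value_seen_in_branch(int n)
{
    return n++ >= 3 ? n : -1;
}
```

value_seen_in_branch(4) returns 5, matching the printf experiment above, and bump_and_reset on a variable holding 2 returns 1 and leaves it at 3.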

Resources