Checking order of operations in C 'if' statement

Checking order of operations in C 'if' statement - c

The following snippet of C code (where a and b are both type double) is what my question is about:
if(1.0-a < b && b <= 1.0)
Based on the order of operations shown in Wikipedia I understand this as evaluating the same as the following code snippet with parentheses:
if( ( (1.0-a) < b ) && ( b <= 1.0) )
which is what I want. I just want to double check my understanding that the two code snippets are indeed equivalent by the order of operations in C.
Note: obviously I could just use the second code snippet and make explicit what I want if() to evaluate; I ask because I've used the first snippet in my code for a while and I want to make sure my previous results from the code are okay.

Quick answer: yes, it is equivalent.
This means that the result of both code snippets is the same; the meaning is the same, but be careful when you talk about order of operations. It looks to me like your question here is about precedence and associativity. The latter tells you what an expression means, not the order of evaluation of its operands. To learn about order of evaluation, read about sequence points: Undefined behavior and sequence points

You ask about "order of operations", but I don't think that's what you really want to know.
The phrase "order of operations" refers to the time order in which operations are performed. In most cases, the order in which operations are performed within an expression is unspecified. The && operator is one of the few exceptions to this; it guarantees that its left operand is evaluated before its right operand (and the right operand might not be evaluated at all).
The parentheses you added can affect which operands are associated with which operators -- and yes, the two expressions
1.0-a < b && b <= 1.0
and
( (1.0-a) < b ) && ( b <= 1.0)
are equivalent.
Parentheses can be used to override operator precedence. They do not generally affect the order in which the operators are evaluated.
An example: this:
x + y * z
is equivalent to this:
x + (y * z)
because multiplication has a higher precedence than addition. But the three operands x, y, and z may be evaluated in any of the 6 possible orders:
x, y, z
x, z, y
y, x, z
y, z, x
z, x, y
z, y, x
The order makes no difference in this case (unless some of them are volatile), but it can matter if they're subexpressions with side effects.

Related

C ternary operator, can I omit one part?

I wanted to know if there's a way to omit second or third part of the ternary operator?
I already read this and similar ones but they didn't help.
What I specifically want is something like:
x == y ? x*=2;
however this gives me error as gcc expects another expression also. So are:
x == y ? x *=2 : continue;
x == y ?: x /=2;
What can I do in these situations except:
if(x == y) do_something;
Edit for further clarification:
As my question seems to be confusing and got all kinds of comments/answers my point was when thinking logically, an else is required after if , so is the default statement in a switch however, neither are mandatory. I was asking if it's the case with ?: also and if so, how.

I wanted to know if there's a way to omit second or third part of the ternary operator?
No, not in standard C.
You can substitute expressions or statements that do not use the conditional operator at all, but in standard C, the conditional operator requires three operands, just like the division operator (/) requires two. You cannot omit any.
Nor is it clear why you want to do. The primary thing that distinguishes the conditional operator from an if [/ else] statement is that expressions using the conditional operator are evaluated to produce values. If you're not interested in that value then using a conditional expression instead of a conditional statement is poor style. A standard if statement is much clearer, and clarity is king. This is a consideration even when you do want the value.
What can I do in these situations except:
if(x == y) do_something;
You can go have a coffee and hope the mood passes.
But if it doesn't, then the logical operators && and || have short-circuiting behavior that might suit, as #EricPostpischil already observed:
a && b is an expression analogous to if (a) b;. It evaluates a, after which there is a sequence point. If a was truthy then it evaluates b and that is the result of the expression; otherwise it does not evaluate b and the value of a is the value of the expression. That is the C version of the hypothetical a ? b : (nothing), and why C does not need the latter.
Similarly, a || b is an expression analogous to if (!a) b;. b is evaluated and yields the result of the expression if and only if a is falsey. That is the C version of the hypothetical a ? (nothing) : b.
But here again, it is poor C style to use && and || expressions exclusively for their side effects. If you don't care about the result of the operation, then use an if statement.
Or perhaps poor style is the point? If you're shooting for an entry in the International Obfuscated C Code Contest then abusing operators is par for the course. In that case, you could consider rewriting your expressions to use the ternary operator after all. For example,
x == y ? x *=2 : continue;
could be written as x *= ((x == y) ? 2 : 1), provided that you weren't actually trying to get loop-cycling behavior out of that continue. And
x == y ?: x /=2;
could be rewritten similarly. Though if you were actually looking toward IOCCC, then there are better obfuscation options available.

For the purpose asked about in this question, in which the result value of the conditional operator would not be used:
For a ? b : c without b you can use a && b, which will evaluate b if and only if a is true.
For a ? b : c without c you can use a || c, which will evaluate c if and only if a is false.
These expressions will have different values than a ? b : c, but that does not matter when the value is not used.
Without some exceptional circumstance to justify this, most experienced programmers would consider it bad practice.
GCC has an extension that uses the first operand for a missing second operand without evaluating it a second time. E.g. f(x) ? : y is equivalent to f(x) ? f(x) : y except that f is only called once.

Similar to the 'hyphen-ish' character of "-1" being called "unary minus", "?:" is called "trenary" because it requires 3 parts: the condition, the "true" case statement and the "false" case statement. To use "?:" you must supply 3 "terms".
Answering the question in the title, no, you cannot omit one part.
The following responds to "What can I do in these situations except:"
Given that your two examples show an interest in performing (or not) a mathematical operation on the variable 'x', here is a "branchless" approach toward that (limited) objective. ("Branchless" coding techniques seek to reduce the impact of "branch prediction misses", an efficiency consideration to reduce processing time.)
Note: the for() loop is only a "test harness" that presents 3 different values for 'y' to be compared to the value of 'x'. The variable 'n' makes more obvious your OP constant '2'. Further, as you are aware, performing multiplication OR division are two completely different operations. This example shows multiplication only. (Replace the '*' with '/' for division with the standard caveat regarding "division by zero" being undefined.) Depending on the probability of "cache misses" and "branch prediction" in modern CPUs, this seemingly complex calculation may require much less processing time than a 'true/false branch' that may bypass processing.
int n = 2; // multiplier
for( int y = 4; y <= 6; y++ ) { // three values for 'y'
int xr = 5; // one value for 'xr'egular
int xb = 5; // same value for 'xb'ranch
(xr == y) ? xr *= n : 1; // to be legitimate C
// when x == y the rhs becomes (n-1)*(1)+1 which equals n
// when x != y the rhs becomes (n-1)*(0)+1 which equals 1 (identity)
// Notice the rhs includes a conditional
// and that the entire statement WILL be evaluated, never bypassed.
xb *= ((n-1)*(xb==y))+1;
printf( "trenaryX = %2d, branchlessX = %2d\n", xr, xb );
}
Output
trenaryX = 5, branchlessX = 5
trenaryX = 10, branchlessX = 10
trenaryX = 5, branchlessX = 5
I hope this makes clear that "trenary" means "3 part" and that this digression into "branchless coding" may have broadened your horizons.

You can use the fact that the result of comparison operators is an int with value 0 or 1...
x == y ? x*=2;
x *= (x == y) + 1; // multiply by either 1 or 2
But a plain if is way more readable
if (x == y) x *= 2;

x == y ? x*=2 : 1;
The syntax requires all three parts... But, if you write code like this, you will lose popularity at the office...
Repeating for those who might have missed it: The syntax requires all three parts.

Actually, you shouldn't do this because as #user229044 commented, "if (x==y) do_something; is exactly what you should do here, not abuse the ternary operator to produce surprising, difficult-to-read code that can only cause problems down the line. You say "I need to know if that's possible", but why? This is exactly what if is for."
As in ternary operator without else in C, you can just have the third/second part of the ternary operator set x to itself, for example, you can just do:
x = (x == y ? x *= 2 : x);
or
x == (y ? x : x /= 2);

Does this C code result in Undefined Behavior?

I know that:
int b = 1, c = 2, d = 3, e = 4;
printf("%d %d %d", ++b, b, b++);
results in undefined behavior. Since
Modifying any object more than once between two sequence points is UB.
Undefined behavior and sequence points
But I don't know if:
int b = 1, c = 2, d = 3, e = 4;
printf("%d", b++ + ++c - --d - e--);
is also UB?
What I think is that increment/decrement operators will evalute first because of the precedence, between them right to left since the associativity . Then arithmetic operators will be evaluated left to right.
Which will just be
(b) + (c + 1) - (d - 1) - (e)
that is, 1 + (2 + 1) - (3 - 1) - (4)
= (2 - 4)
= -2
Is it right?

But I don't know if: ... is also UB?
It is not, but your reasoning about why is fuzzy.
What I think is that increment/decrement operators will evaluate first because of the precedence, between them right to left since the associativity . Then arithmetic operators will be evaluated left to right.
Precedence determines how the result is calculated. It doesn't say anything about the ordering of the side-effects.
There is no equivalent of precedence telling you when the side effects (the stored value of b has been incremented, the stored value of e has been decremented) are observable during the statement. All you know is that the variables have taken their new values before the next statement (ie, by the ;).
So, the reason this is well-defined is that it does not depend on those side-effects.
I deliberately hand-waved the language to avoid getting bogged down, but I should probably clarify:
"during the statement" really means "before the next sequence point"
"before the next statement (... ;)" really means "at the next sequence point"
See Order of evaluation:
There is a sequence point after the evaluation of all function arguments and of the function designator, and before the actual function call.
So really the side-effects are committed before the call to printf, so earlier than the ; at the end of the statement.

There is a gigantic difference between the expressions
b++ + ++c - --d - e--
(which is fine), and
x++ + ++x - --x - x--
(which is rampantly undefined).
It's not using ++ or -- that makes an expression undefined. It's not even using ++ or -- twice in the same expression. No, the problem is when you use ++ or -- to modify a variable inside an expression, and you also try to use the value of that same variable elsewhere in the same expression, and without an intervening sequence point.
Consider the simpler expression
++z + z;
Now, obviously the subexpression ++z will increment z. So the question is, does the + z part use the old or the new value of z? And the answer is that there is no answer, which is why this expression is undefined.
Remember, expressions like ++z do not just mean, "take z's value and add 1". They mean, "take z's value and add 1, and store the result back into z". These expressions have side effects. And the side effects are at the root of the undefinedness issue.

Why does a=(b++) have the same behavior as a=b++?

I am writing a small test app in C with GCC 4.8.4 pre-installed on my Ubuntu 14.04. And I got confused for the fact that the expression a=(b++); behaves in the same way as a=b++; does. The following simple code is used:
#include <stdint.h>
#include <stdio.h>
int main(int argc, char* argv[]){
uint8_t a1, a2, b1=10, b2=10;
a1=(b1++);
a2=b2++;
printf("a1=%u, a2=%u, b1=%u, b2=%u.\n", a1, a2, b1, b2);
}
The result after gcc compilation is a1=a2=10, while b1=b2=11. However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
Namely, a1 should be 11 while a2 equals 10.
Does anyone get an idea about this issue?

However, I expected the parentheses to have b1 incremented before its value is assigned to a1
You should not have expected that: placing parentheses around an increment expression does not alter the application of its side effects.
Side effects (in this case, it means writing 11 into b1) get applied some time after retrieving the current value of b1. This could happen before or after the full assignment expression is evaluated completely. That is why a post-increment will remain a post-increment, with or without parentheses around it. If you wanted a pre-increment, place ++ before the variable:
a1 = ++b1;

Quoting from the C99:6.5.2.4:
The result of the postfix ++ operator is the value of the operand.
After the result is obtained, the value of the operand is incremented.
(That is, the value 1 of the appropriate type is added to it.) See the
discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of
operations on pointers. The side effect of updating the stored value
of the operand shall occur between the previous and the next sequence
point.
You can look up the C99: annex C to understand what the valid sequence points are.
In your question, just adding a parentheses doesn't change the sequence points, only the ; character does that.
Or in other words, you can view it like there's a temporary copy of b and the side-effect is original b incremented. But, until a sequence point is reached, all evaluation is done on the temporary copy of b. The temporary copy of b is then discarded, the side effect i.e. increment operation is committed to the storage,when a sequence point is reached.

Parentheses can be tricky to think about. But they do not mean, "make sure that everything inside happens first".
Suppose we have
a = b + c * d;
The higher precedence of multiplication over addition tells us that the compiler will arrange to multiply c by d, and then add the result to b. If we want the other interpretation, we can use parentheses:
a = (b + c) * d;
But suppose that we have some function calls thrown into the mix. That is, suppose we write
a = x() + y() * z();
Now, while it's clear that the return value of y() will be multiplied by the return value of z(), can we say anything about the order that x(), y(), and z() will be called in? The answer is, no, we absolutely cannot! If you're at all unsure, I invite you to try it, using x, y, and z functions like this:
int x() { printf("this is x()\n"); return 2; }
int y() { printf("this is y()\n"); return 3; }
int z() { printf("this is z()\n"); return 4; }
The first time I tried this, using the compiler in front of me, I discovered that function x() was called first, even though its result is needed last. When I changed the calling code to
a = (x() + y()) * z();
the order of the calls to x, y, and z stayed exactly the same, the compiler just arranged to combine their results differently.
Finally, it's important to realize that expressions like i++ do two things: they take i's value and add 1 to it, and then they store the new value back into i. But the store back into i doesn't necessarily happen right away, it can happen later. And the question of "when exactly does the store back into i happen?" is sort of like the question of "when does function x get called?". You can't really tell, it's up to the compiler, it usually doesn't matter, it will differ from compiler to compiler, if you really care, you're going to have to do something else to force the order.
And in any case, remember that the definition of i++ is that it gives the old value of i out to the surrounding expression. That's a pretty absolute rule, and it can not be changed just by adding some parentheses! That's not what parentheses do.
Let's go back to the previous example involving functions x, y, and z. I noticed that function x was called first. Suppose I didn't want that, suppose I wanted functions y and z to be called first. Could I achieve that by writing
x = z() + ((y() * z())?
I could write that, but it doesn't change anything. Remember, the parentheses don't mean "do everything inside first". They do cause the multiplication to happen before the addition, but the compiler was already going to do it that way anyway, based on the higher precedence of multiplication over addition.
Up above I said, "if you really care, you're going to have to do something else to force the order". What you generally have to do is use some temporary variables and some extra statements. (The technical term is "insert some sequence points.") For example, to cause y and z to get called first, I could write
c = y();
d = z();
b = x();
a = b + c * d;
In your case, if you wanted to make sure that the new value of b got assigned to a, you could write
c = b++;
a = b;
But of course that's silly -- if all you want to do is increment b and have its new value assigned to a, that's what prefix ++ is for:
a = ++b;

Your expectations are completely unfounded.
Parentheses have no direct effect on the order of execution. They don't introduce sequence points into the expression and thus they don't force any side-effects to materialize earlier than they would've materialized without parentheses.
Moreover, by definition, post-increment expression b++ evaluates to the original value of b. This requirement will remain in place regardless of how many pair of parentheses you add around b++. Even if parentheses somehow "forced" an instant increment, the language would still require (((b++))) to evaluate to the old value of b, meaning that a would still be guaranteed to receive the non-incremented value of b.
Parentheses only affects the syntactic grouping between operators and their operands. For example, in your original expression a = b++ one might immediately ask whether the ++ apples to b alone or to the result of a = b. In your case, by adding the parentheses you simply explicitly forced the ++ operator to apply to (to group with) b operand. However, according to the language syntax (and the operator precedence and associativity derived from it), ++ already applies to b, i.e. unary ++ has higher precedence than binary =. Your parentheses did not change anything, it only reiterated the grouping that was already there implicitly. Hence no change in the behavior.

Parentheses are entirely syntactic. They just group expressions and they are useful if you want to override the precedence or associativity of operators. For example, if you use parentheses here:
a = 2*(b+1);
you mean that the result of b+1 should be doubled, whereas if you omit the parentheses:
a = 2*b+1;
you mean that just b should be doubled and then the result should be incremented. The two syntax trees for these assignments are:
= =
/ \ / \
a * a +
/ \ / \
2 + * 1
/ \ / \
b 1 2 b
a = 2*(b+1); a = 2*b+1;
By using parentheses, you can therefore change the syntax tree that corresponds to your program and (of course) different syntax may correspond to different semantics.
On the other hand, in your program:
a1 = (b1++);
a2 = b2++;
parentheses are redundant because the assignment operator has lower precedence than the postfix increment (++). The two assignments are equivalent; in both cases, the corresponding syntax tree is the following:
=
/ \
a ++ (postfix)
|
b
Now that we're done with the syntax, let's go to semantics. This statement means: evaluate b++ and assign the result to a. Evaluating b++ returns the current value of b (which is 10 in your program) and, as a side effect, increments b (which now becomes 11). The returned value (that is, 10) is assigned to a. This is what you observe, and this is the correct behaviour.

However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
You aren't assigning b1 to a1: you're assigning the result of the postincrement expression.
Consider the following program, which prints the value of b when executing assignment:
#include <iostream>
using namespace std;
int b;
struct verbose
{
int x;
void operator=(int y) {
cout << "b is " << b << " when operator= is executed" << endl;
x = y;
}
};
int main() {
// your code goes here
verbose a;
b = 10;
a = b++;
cout << "a is " << a.x << endl;
return 0;
}
I suspect this is undefined behavior, but nonetheless when using ideone.com I get the output shown below
b is 11 when operator= is executed
a is 10

OK, in a nutshell: b++ is a unary expression, and parentheses around it won't ever take influence on precedence of arithmetic operations, because the ++ increment operator has one of the highest (if not the highest) precedence in C. Whilst in a * (b + c), the (b + c) is a binary expression (not to be confused with binary numbering system!) because of a variable b and its addend c. So it can easily be remembered like this: parentheses put around binary, ternary, quaternary...+INF expressions will almost always have influence on precedence(*); parentheses around unary ones NEVER will - because these are "strong enough" to "withstand" grouping by parentheses.
(*)As usual, there are some exceptions to the rule, if only a handful: e. g. -> (to access members of pointers on structures) has a very strong binding despite being a binary operator. However, C beginners are very likely to take quite awhile until they can write a -> in their code, as they will need an advanced understanding of both pointers and structures.

The parentheses will not change the post-increment behaviour itself.
a1=(b1++); //b1=10
It equals to,
uint8_t mid_value = b1++; //10
a1 = (mid_value); //10

Placing ++ at the end of a statement (known as post-increment), means that the increment is to be done after the statement.
Even enclosing the variable in parenthesis doesn't change the fact that it will be incremented after the statement is done.
From learn.geekinterview.com:
In the postfix form, the increment or decrement takes place after the value is used in expression evaluation.
In prefix increment or decrement operation the increment or decrement takes place before the value is used in expression evaluation.
That's why a = (b++) and a = b++ are the same in terms of behavior.
In your case, if you want to increment b first, you should use pre-increment, ++b instead of b++ or (b++).
Change
a1 = (b1++);
to
a1 = ++b1; // b will be incremented before it is assigned to a.

To make it short:
b++ is incremented after the statement is done
But even after that, the result of b++ is put to a.
Because of that parentheses do not change the value here.

Implications of operator precedence in C

I understand that this topic has come up umpteen times but I request a moment.
I have tried understanding this many times, also in context of order of evaluation. I was looking for some explicit examples to understand op. precedence and I found one here: http://docs.roxen.com/pike/7.0/tutorial/expressions/operator_tables.xml What I would like to know is if the examples given there (I have cut-pasted them below) are correct.
1+2*2 => 1+(2*2)
1+2*2*4 => 1+((2*2)*4)
(1+2)*2*4 => ((1+2)*2)*4
1+4,c=2|3+5 => (1+4),(c=(2|(3+5)))
1 + 5&4 == 3 => (1 + 5) & (4 == 3)
c=1,99 => (c=1),99
!a++ + ~f() => (!(a++)) + (~(f()))
s == "klas" || i < 9 => (s == "klas") || (i < 9)
r = s == "sten" => r = (s == "sten")
For instance, does 1+2*2*4 is really 1+((2*2)*4) or could as well have been, 1+(2*(2*4)) according to C specification. Any help or further reference to examples would be useful. Thanks again.

Although those examples come from a different language, I think they are the same as operator precedence in C. In general, you'd be better off using a reference for the C language, such as the C standard, or a summary such as the one in Wikipedia.
However, I don't believe that is actually what you are asking. Operator precedence has no implications for order of evaluation. All operator precedence does is show you how to parenthesize the expression. A C compiler is allowed to evaluate the operations in just about any order it wishes to. It is also allowed to use algebraic identities if it is provable that they will have the same result for all valid inputs (this is not usually the case for floating point calculations, but it is usually true for unsigned integer calculations).
The only cases where the compiler is required to produce code with a specific evaluation order are:
Short-circuit boolean operators && and ||: the left argument must be evaluated first, and in some cases the right argument may not be evaluated;
The so-called ternary operator ?:: the left argument (before the ?) must be evaluated first; subsequently, exactly one of the other two operators will be evaluated. Note that this operator groups to the right, demonstrating that there is no relationship between grouping and evaluation order. That is, pred_1 ? action_1() : pred_2 ? action_2() : pred_3 ? action_3() is the same as pred_1 ? action_1() : (pred_2 ? action_2() : pred_3 ? action_3()), but it's pred_1 which must be evaluated first.
The comma operator ,: the left argument must be evaluated first. This is not the same as the use of the comma in function calls.
Function arguments must be evaluated before the function is called, although the order of evaluation of the arguments is not specified, and neither is the order of evaluation of the expression which produces the function.
The last phrase refers to examples such as this:
// This code has Undefined Behaviour. DO NOT USE
typedef void(*takes_int_returns_void)(int);
takes_int_returns_void fvector[3] = {...}
//...
//...
(*fvector[i++])(i);
Here, a compiler may choose to increment i before or after it evaluates the argument to the function (or other less pleasant possibilities), so you don't actually know what value the function will be called with.
In the case of 1+2*2*4, the compiler must generate code which will produce 17. How it does that is completely up to the compiler. Furthermore, if all x, y and z are all unsigned integers, a compiler may compile 1 + x*y*z with any order of multiplications it wants to, even reordering to y*(x*z).

Most operators have precedence from left to right.This will give a detailed idea about operator precedence :
Click here!

Binary operators, other than assignment operators, go from left to right when they are of equal precedence, so 1 + 2 * 2 * 4 is equivalent to 1 + ((2 * 2) * 4). Obviously in this particular case 1 + (2 * (2 * 4)) gives the same answer, but it won't always. For instance, 1 + 2 / 2.0 * 4 evaluates to 1 + ((2 / 2.0) * 4) == 5.0 and not to 1 + (2 / (2.0 * 4)) == 1.25.
Order of evaluation is a completely different thing from operator precedence. For one thing, operator precedence is always well-defined, order of evaluation sometimes is not (e.g. the order in which function arguments are evaluated).

This is a perfect tutorial about operator precedence and order of evaluation. Enjoy!

c, bitwise, logic expression

int x = 0;
x^=x || x++ || ++x;
and the answer for x at last is 3.
How to analysis this expression?
little confused about this.
Thanks a lot.

This is undefined behaviour. The result could be anything. This is because there is no sequence point between the ++x and the x ^=, so there is no guarantee which will be "done" first.

It's undefined behaviour - so you can get any answer you'd like.

As others have already noted, this is undefined behavior. But why?
When programming in C, there is an inherent difference between a statement and an expression. Expression evaluation should give you the same observable results in any case (e.g., (x + 5) + 2 is the same as x + (5 + 2)). Statements, on the other hand, are used for sequencing of side-effects, that is to say, will generally result in, say, writing to some memory location.
Considering the above, expressions are safe to "nest" into statements, whereas nesting statements into expressions isn't. By "safe" I mean "no surprising results".
In your example, we have
x^=x || x++ || ++x;
Which order should the evaluation go about? Since || operates on expressions, it shouldn't matter whether we go (x || x++) || ++x or x || (x++ || ++x) or even ++x || (x || x++). However, since x++ and ++x are statements (even though C allows them to be used as expressions), we cannot proceed by algebraic reasoning. So, you will need to express the order of operations explicitly, by writing multiple statements.

XOR 0 with 0 is 0. Then ++ twice is equal to 2. Nevertheless, as pointed in other answers, there's no sequence point. So the output could be anything.