c, bitwise, logic expression - c

int x = 0;
x^=x || x++ || ++x;
and the answer for x at last is 3.
How to analysis this expression?
little confused about this.
Thanks a lot.

This is undefined behaviour. The result could be anything. This is because there is no sequence point between the ++x and the x ^=, so there is no guarantee which will be "done" first.

It's undefined behaviour - so you can get any answer you'd like.

As others have already noted, this is undefined behavior. But why?
When programming in C, there is an inherent difference between a statement and an expression. Expression evaluation should give you the same observable results in any case (e.g., (x + 5) + 2 is the same as x + (5 + 2)). Statements, on the other hand, are used for sequencing of side-effects, that is to say, will generally result in, say, writing to some memory location.
Considering the above, expressions are safe to "nest" into statements, whereas nesting statements into expressions isn't. By "safe" I mean "no surprising results".
In your example, we have
x^=x || x++ || ++x;
Which order should the evaluation go about? Since || operates on expressions, it shouldn't matter whether we go (x || x++) || ++x or x || (x++ || ++x) or even ++x || (x || x++). However, since x++ and ++x are statements (even though C allows them to be used as expressions), we cannot proceed by algebraic reasoning. So, you will need to express the order of operations explicitly, by writing multiple statements.

XOR 0 with 0 is 0. Then ++ twice is equal to 2. Nevertheless, as pointed in other answers, there's no sequence point. So the output could be anything.

Related

C ternary operator, can I omit one part?

I wanted to know if there's a way to omit second or third part of the ternary operator?
I already read this and similar ones but they didn't help.
What I specifically want is something like:
x == y ? x*=2;
however this gives me error as gcc expects another expression also. So are:
x == y ? x *=2 : continue;
x == y ?: x /=2;
What can I do in these situations except:
if(x == y) do_something;
Edit for further clarification:
As my question seems to be confusing and got all kinds of comments/answers my point was when thinking logically, an else is required after if , so is the default statement in a switch however, neither are mandatory. I was asking if it's the case with ?: also and if so, how.
I wanted to know if there's a way to omit second or third part of the ternary operator?
No, not in standard C.
You can substitute expressions or statements that do not use the conditional operator at all, but in standard C, the conditional operator requires three operands, just like the division operator (/) requires two. You cannot omit any.
Nor is it clear why you want to do. The primary thing that distinguishes the conditional operator from an if [/ else] statement is that expressions using the conditional operator are evaluated to produce values. If you're not interested in that value then using a conditional expression instead of a conditional statement is poor style. A standard if statement is much clearer, and clarity is king. This is a consideration even when you do want the value.
What can I do in these situations except:
if(x == y) do_something;
You can go have a coffee and hope the mood passes.
But if it doesn't, then the logical operators && and || have short-circuiting behavior that might suit, as #EricPostpischil already observed:
a && b is an expression analogous to if (a) b;. It evaluates a, after which there is a sequence point. If a was truthy then it evaluates b and that is the result of the expression; otherwise it does not evaluate b and the value of a is the value of the expression. That is the C version of the hypothetical a ? b : (nothing), and why C does not need the latter.
Similarly, a || b is an expression analogous to if (!a) b;. b is evaluated and yields the result of the expression if and only if a is falsey. That is the C version of the hypothetical a ? (nothing) : b.
But here again, it is poor C style to use && and || expressions exclusively for their side effects. If you don't care about the result of the operation, then use an if statement.
Or perhaps poor style is the point? If you're shooting for an entry in the International Obfuscated C Code Contest then abusing operators is par for the course. In that case, you could consider rewriting your expressions to use the ternary operator after all. For example,
x == y ? x *=2 : continue;
could be written as x *= ((x == y) ? 2 : 1), provided that you weren't actually trying to get loop-cycling behavior out of that continue. And
x == y ?: x /=2;
could be rewritten similarly. Though if you were actually looking toward IOCCC, then there are better obfuscation options available.
For the purpose asked about in this question, in which the result value of the conditional operator would not be used:
For a ? b : c without b you can use a && b, which will evaluate b if and only if a is true.
For a ? b : c without c you can use a || c, which will evaluate c if and only if a is false.
These expressions will have different values than a ? b : c, but that does not matter when the value is not used.
Without some exceptional circumstance to justify this, most experienced programmers would consider it bad practice.
GCC has an extension that uses the first operand for a missing second operand without evaluating it a second time. E.g. f(x) ? : y is equivalent to f(x) ? f(x) : y except that f is only called once.
Similar to the 'hyphen-ish' character of "-1" being called "unary minus", "?:" is called "trenary" because it requires 3 parts: the condition, the "true" case statement and the "false" case statement. To use "?:" you must supply 3 "terms".
Answering the question in the title, no, you cannot omit one part.
The following responds to "What can I do in these situations except:"
Given that your two examples show an interest in performing (or not) a mathematical operation on the variable 'x', here is a "branchless" approach toward that (limited) objective. ("Branchless" coding techniques seek to reduce the impact of "branch prediction misses", an efficiency consideration to reduce processing time.)
Note: the for() loop is only a "test harness" that presents 3 different values for 'y' to be compared to the value of 'x'. The variable 'n' makes more obvious your OP constant '2'. Further, as you are aware, performing multiplication OR division are two completely different operations. This example shows multiplication only. (Replace the '*' with '/' for division with the standard caveat regarding "division by zero" being undefined.) Depending on the probability of "cache misses" and "branch prediction" in modern CPUs, this seemingly complex calculation may require much less processing time than a 'true/false branch' that may bypass processing.
int n = 2; // multiplier
for( int y = 4; y <= 6; y++ ) { // three values for 'y'
int xr = 5; // one value for 'xr'egular
int xb = 5; // same value for 'xb'ranch
(xr == y) ? xr *= n : 1; // to be legitimate C
// when x == y the rhs becomes (n-1)*(1)+1 which equals n
// when x != y the rhs becomes (n-1)*(0)+1 which equals 1 (identity)
// Notice the rhs includes a conditional
// and that the entire statement WILL be evaluated, never bypassed.
xb *= ((n-1)*(xb==y))+1;
printf( "trenaryX = %2d, branchlessX = %2d\n", xr, xb );
}
Output
trenaryX = 5, branchlessX = 5
trenaryX = 10, branchlessX = 10
trenaryX = 5, branchlessX = 5
I hope this makes clear that "trenary" means "3 part" and that this digression into "branchless coding" may have broadened your horizons.
You can use the fact that the result of comparison operators is an int with value 0 or 1...
x == y ? x*=2;
x *= (x == y) + 1; // multiply by either 1 or 2
But a plain if is way more readable
if (x == y) x *= 2;
x == y ? x*=2 : 1;
The syntax requires all three parts... But, if you write code like this, you will lose popularity at the office...
Repeating for those who might have missed it: The syntax requires all three parts.
Actually, you shouldn't do this because as #user229044 commented, "if (x==y) do_something; is exactly what you should do here, not abuse the ternary operator to produce surprising, difficult-to-read code that can only cause problems down the line. You say "I need to know if that's possible", but why? This is exactly what if is for."
As in ternary operator without else in C, you can just have the third/second part of the ternary operator set x to itself, for example, you can just do:
x = (x == y ? x *= 2 : x);
or
x == (y ? x : x /= 2);

Does this C code result in Undefined Behavior?

I know that:
int b = 1, c = 2, d = 3, e = 4;
printf("%d %d %d", ++b, b, b++);
results in undefined behavior. Since
Modifying any object more than once between two sequence points is UB.
Undefined behavior and sequence points
But I don't know if:
int b = 1, c = 2, d = 3, e = 4;
printf("%d", b++ + ++c - --d - e--);
is also UB?
What I think is that increment/decrement operators will evalute first because of the precedence, between them right to left since the associativity . Then arithmetic operators will be evaluated left to right.
Which will just be
(b) + (c + 1) - (d - 1) - (e)
that is, 1 + (2 + 1) - (3 - 1) - (4)
= (2 - 4)
= -2
Is it right?
But I don't know if: ... is also UB?
It is not, but your reasoning about why is fuzzy.
What I think is that increment/decrement operators will evaluate first because of the precedence, between them right to left since the associativity . Then arithmetic operators will be evaluated left to right.
Precedence determines how the result is calculated. It doesn't say anything about the ordering of the side-effects.
There is no equivalent of precedence telling you when the side effects (the stored value of b has been incremented, the stored value of e has been decremented) are observable during the statement. All you know is that the variables have taken their new values before the next statement (ie, by the ;).
So, the reason this is well-defined is that it does not depend on those side-effects.
I deliberately hand-waved the language to avoid getting bogged down, but I should probably clarify:
"during the statement" really means "before the next sequence point"
"before the next statement (... ;)" really means "at the next sequence point"
See Order of evaluation:
There is a sequence point after the evaluation of all function arguments and of the function designator, and before the actual function call.
So really the side-effects are committed before the call to printf, so earlier than the ; at the end of the statement.
There is a gigantic difference between the expressions
b++ + ++c - --d - e--
(which is fine), and
x++ + ++x - --x - x--
(which is rampantly undefined).
It's not using ++ or -- that makes an expression undefined. It's not even using ++ or -- twice in the same expression. No, the problem is when you use ++ or -- to modify a variable inside an expression, and you also try to use the value of that same variable elsewhere in the same expression, and without an intervening sequence point.
Consider the simpler expression
++z + z;
Now, obviously the subexpression ++z will increment z. So the question is, does the + z part use the old or the new value of z? And the answer is that there is no answer, which is why this expression is undefined.
Remember, expressions like ++z do not just mean, "take z's value and add 1". They mean, "take z's value and add 1, and store the result back into z". These expressions have side effects. And the side effects are at the root of the undefinedness issue.

C, pre and post increment, different answers in different programs [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 6 years ago.
testing some code when studying C language,
#include <stdio.h>
#include <math.h>
#define hypotenusa(x, y) sqrt((x) * (x) + (y) * (y))
int main(void) {
int a, x;
x = 2;
a = hypotenusa(++x, ++x);
printf("%d\n", a);
}
And I am getting the answer
6 in one program(dosbox gcc compiler)
7 in codelight gcc compiler and
8 on codeChef online compiler
can anyone explain this behaviour?
my logic says it should be 6 (sqrt(42)) but...
It's undefined behaviour.
After the macro replacement
a = hypotenusa(++x, ++x);
becomes:
a = sqrt((++x) * (++x) + (++x) * (++x));
As you can see x is modified multiple times without any intervening sequence point(s). See What Every C Programmer Should Know About Undefined Behavior.
hypotenusa(++x, ++x) is undefined behavior.
It is up to the compiler which of the parameters gets incremented (and pushed) first - after the macro expansion, there are a total of four instances, and the sequence is not defined.
You should never increment the same variable multiple times in the same statement, to avoid this kind of issues. Using a variable twice in a macro can hide this error and make it really nasty.
The behavior of your macro is undefined, meaning any result is possible.
Chapter and verse
6.5 Expressions
...
2 If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object,
the behavior is undefined. If there are multiple allowable orderings of the
subexpressions
of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any
of the orderings.84)
84) This paragraph renders undefined statement expressions such as i = ++i + 1;
a[i++] = i;
while allowing i = i + 1;
a[i] = i;
Basically, if you modify the value of an object more than once in an expression, or both modify an object and use its value in a computation in an expression, the result will not be well-defined unless there is a sequence point in between those operations. With a few exceptions, C does not force left-to-right evaluation of expressions or function parameter evaluations, nor does it mandate that the side effect of ++ or -- be applied immediately after evaluation. Thus, the result of an expression like x++ * x++ will vary from platform to platform, program to program, or even potentially from run to run (although I've never seen that in practice).
For example, given the expression y = x++ * x++, the following evaluation sequence is possible:
t0 <- x // right hand x++, t0 == 2
t1 <- x // left hand x++, t1 == 2
y <- t0 * t1 // y = 2 * 2 == 4
x <- x + 1 // x == 3
x <- x + 1 // x == 4
which doesn't give you the result you expect if you assume left-to-right evaluation.

Checking order of operations in C 'if' statement

The following snippet of C code (where a and b are both type double) is what my question is about:
if(1.0-a < b && b <= 1.0)
Based on the order of operations shown in Wikipedia I understand this as evaluating the same as the following code snippet with parentheses:
if( ( (1.0-a) < b ) && ( b <= 1.0) )
which is what I want. I just want to double check my understanding that the two code snippets are indeed equivalent by the order of operations in C.
Note: obviously I could just use the second code snippet and make explicit what I want if() to evaluate; I ask because I've used the first snippet in my code for a while and I want to make sure my previous results from the code are okay.
Quick answer: yes, it is equivalent.
This means that the result of both code snippets is the same; the meaning is the same, but be careful when you talk about order of operations. It looks to me like your question here is about precedence and associativity. The latter tells you what an expression means, not the order of evaluation of its operands. To learn about order of evaluation, read about sequence points: Undefined behavior and sequence points
You ask about "order of operations", but I don't think that's what you really want to know.
The phrase "order of operations" refers to the time order in which operations are performed. In most cases, the order in which operations are performed within an expression is unspecified. The && operator is one of the few exceptions to this; it guarantees that its left operand is evaluated before its right operand (and the right operand might not be evaluated at all).
The parentheses you added can affect which operands are associated with which operators -- and yes, the two expressions
1.0-a < b && b <= 1.0
and
( (1.0-a) < b ) && ( b <= 1.0)
are equivalent.
Parentheses can be used to override operator precedence. They do not generally affect the order in which the operators are evaluated.
An example: this:
x + y * z
is equivalent to this:
x + (y * z)
because multiplication has a higher precedence than addition. But the three operands x, y, and z may be evaluated in any of the 6 possible orders:
x, y, z
x, z, y
y, x, z
y, z, x
z, x, y
z, y, x
The order makes no difference in this case (unless some of them are volatile), but it can matter if they're subexpressions with side effects.

Output of the following C program

What should be the output of this C program?
#include<stdio.h>
int main(){
int x,y,z;
x=y=z=1;
z = ++x || ++y && ++z;
printf("x=%d y=%d z=%d\n",x,y,z);
return 0;
}
The given output is :
x=2 y=1 z=1
I understand the output for x, but fail to see how y and z values don't get incremented.
This is a result of short-circuit evaluation.
The expression ++x evaluates to 2, and the compiler knows that 2 || anything always evaluates to 1 ("true") no matter what anything is. Therefore it does not proceed to evaluate anything and the values of y and z do not change.
If you try with
x=-1;
y=z=1;
You will see that y and z will be incremented, because the compiler has to evaluate the right hand side of the OR to determine the result of the expression.
Edit: asaerl answered your follow-up question in the comments first so I 'll just expand on his correct answer a little.
Operator precedence determines how the parts that make up an expression bind together. Because AND has higher precedence than OR, the compiler knows that you wrote
++x || (++y && ++z)
instead of
(++x || ++y) && ++z
This leaves it tasked to do an OR between ++x and ++y && ++z. At this point it would normally be free to select if it would "prefer" to evaluate one or the other expression first -- as per the standard -- and you would not normally be able to depend on the specific order. This order has nothing to do with operator precedence.
However, specifically for || and && the standard demands that evaluation will always proceed from left to right so that short-circuiting can work and developers can depend on the rhs expression not being evaluated if the result of evaluating the lhs tells.
In C, any thing other than 0 is treated as true, And the evaluation for the || start from left to right.
Hence the compiler will check first left operand and if it is true then the compiler will not check other operands.
ex. A || B - In this case if A is true then compiler will return true only, and will not check whether B is true or False. But if A is false then it will check B and return accordingly means if B is true then will return true or if B is false then it will return false.
In your program compiler first will check ++x(i.e 2) and anything other than 0 is true in C. Hence it will not check/increment other expressions.

Resources