In the code below, sum1 comes out as 46, which is what I would expect if operator precedence worked left to right. But sum2 comes out as 48, as if its precedence were right to left. Why are the two answers different?
#include <stdio.h>
int func(int *k) {
    *k += 4;
    return 3 * (*k) - 1;
}
int main(void) {
    int i = 10, j = 10, sum1, sum2;
    sum1 = (i / 2) + func(&i);
    sum2 = func(&j) + (j / 2);
    printf("%d\n", sum1);
    printf("%d", sum2);
    return 0;
}
In the expression (i / 2) + func(&i), the compiler (or the C implementation generally) is free to evaluate either i / 2 first or func(&i) first. Similarly, in func(&j) + (j/2), the compiler is free to evaluate func(&j) or j/2 first.
Precedence is irrelevant. Precedence tells us how an expression is structured, but it does not fully determine the order in which it is evaluated. Precedence tells us that, in a * b + c * d, the structure must be (a * b) + (c * d). In a + b + c, precedence, in the form of left-to-right association for +, tells us the structure must be (a + b) + c. It does not tell us that a must be evaluated before c. For example, in a() + b() + c(), the structure is (a() + b()) + c(), but the compiler may call the functions in any order, holding their results in temporary registers if needed, and then add the results.
In func(&j)+(j/2), there is no right-to-left precedence or association. No rule in the C standard says j/2 must be evaluated before func(&j).
A compiler might tend to evaluate subexpressions from left to right, in the absence of other constraints. However, various factors may alter that. For example, if one subexpression appears multiple times, the compiler might evaluate it early and retain its value for reuse. Essentially, the compiler builds a tree structure describing the expressions it needs to evaluate and then seeks optimal ways to evaluate them. It does not necessarily proceed left to right, and you cannot rely on any particular evaluation order.
Questions about Sequencing
The C standard has a rule, in C 2018 6.5 2, that says if a modification to an object, as occurs to i in the statement *k+=4;, is unsequenced relative to a value computation using the same object, as occurs for i in i / 2, then the behavior is undefined. However, this problem does not occur in this code because the modification and the value computation are indeterminately sequenced, not unsequenced: 6.5.2.2 10 says “… Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to execution of the called function.” C 5.1.2.3 3 says “… Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which…”
The result of the program is unspecified because the order of evaluation of the operands of an additive operator is unspecified. (The behavior is not undefined, because the function call is indeterminately sequenced relative to the other operand rather than unsequenced.)
Note that within each assignment expression the variable i or j is changed by the function call, and the order of that change relative to the division is not fixed. Either i / 2 or j / 2 can be evaluated before the function call, or vice versa.
The & preceding the variable name (&i or &j) passes a pointer to the variable, so any change the function makes through that pointer is visible to the caller. In this case, func adds 4 to the variable and then returns a value based on it. Because this compiler happened to evaluate the operands left to right, each later use of the variable reflects that change.
sum1 = (i/2) +func(&i) (where i = 10)
--> sum1 = 5 + func(&i)
--> sum1 = 5 + 41 (and now i = 14)
sum2 = func(&j) + (j/2) (where j = 10)
--> sum2 = 41 + (j/2) (and now j = 14)
--> sum2 = 41 + 7
I know that:
int b = 1, c = 2, d = 3, e = 4;
printf("%d %d %d", ++b, b, b++);
results in undefined behavior. Since
Modifying any object more than once between two sequence points is UB.
Undefined behavior and sequence points
But I don't know if:
int b = 1, c = 2, d = 3, e = 4;
printf("%d", b++ + ++c - --d - e--);
is also UB?
What I think is that increment/decrement operators will evaluate first because of the precedence, between them right to left because of the associativity. Then arithmetic operators will be evaluated left to right.
Which will just be
(b) + (c + 1) - (d - 1) - (e)
that is, 1 + (2 + 1) - (3 - 1) - (4)
= (2 - 4)
= -2
Is it right?
But I don't know if: ... is also UB?
It is not, but your reasoning about why is fuzzy.
What I think is that increment/decrement operators will evaluate first because of the precedence, between them right to left since the associativity . Then arithmetic operators will be evaluated left to right.
Precedence determines how the result is calculated. It doesn't say anything about the ordering of the side-effects.
There is no equivalent of precedence telling you when the side effects (the stored value of b has been incremented, the stored value of e has been decremented) are observable during the statement. All you know is that the variables have taken their new values before the next statement (ie, by the ;).
So, the reason this is well-defined is that it does not depend on those side-effects.
I deliberately hand-waved the language to avoid getting bogged down, but I should probably clarify:
"during the statement" really means "before the next sequence point"
"before the next statement (... ;)" really means "at the next sequence point"
See Order of evaluation:
There is a sequence point after the evaluation of all function arguments and of the function designator, and before the actual function call.
So really the side-effects are committed before the call to printf, so earlier than the ; at the end of the statement.
There is a gigantic difference between the expressions
b++ + ++c - --d - e--
(which is fine), and
x++ + ++x - --x - x--
(which is rampantly undefined).
It's not using ++ or -- that makes an expression undefined. It's not even using ++ or -- twice in the same expression. No, the problem is when you use ++ or -- to modify a variable inside an expression, and you also try to use the value of that same variable elsewhere in the same expression, and without an intervening sequence point.
Consider the simpler expression
++z + z;
Now, obviously the subexpression ++z will increment z. So the question is, does the + z part use the old or the new value of z? And the answer is that there is no answer, which is why this expression is undefined.
Remember, expressions like ++z do not just mean, "take z's value and add 1". They mean, "take z's value and add 1, and store the result back into z". These expressions have side effects. And the side effects are at the root of the undefinedness issue.
In the following function:
int fun(int *k) {
    *k += 4;
    return 3 * (*k) - 1;
}
int main(void) {
    int i = 10, j = 10, sum1, sum2;
    sum1 = (i / 2) + fun(&i);
    sum2 = fun(&j) + (j / 2);
    return 0;
}
You'd get sum 1 to equal 46, and sum 2 to equal 48. How would the function run if there were no precedence rules?
How drastically different would things be without consistent precedence rules?
The precedence rules tell us how an expression is structured, not how it is evaluated. In sum1 = (i / 2) + fun(&i);, the rules tell us things including:
i / 2 is grouped together because it is in parentheses; it cannot form, for example, (sum1 = i) / 2 + fun(&i);.
(i / 2) and fun(&i) are grouped together because + has higher precedence than =, making sum1 = ((i / 2) + fun(&i)); rather than (sum1 = (i / 2)) + fun(&i);.
The precedence rules do not tell us whether i / 2 or fun(&i) is evaluated first. In fact, no rules in the C standard specify whether i / 2 or fun(&i) is evaluated first. The compiler may choose.
If i / 2 is evaluated first, the result will be 10 / 2 + 41 and then 5 + 41 and finally 46. If fun(&i) is evaluated first, the result will be 14 / 2 + 41 and then 7 + 41 and finally 48. Your compiler chose the former. It could have chosen the latter.
How would the function run if there were no precedence rules?
If there were no rules, we would not know how the function would be executed. The rules are what tell us how it will be executed.
Some comments assert that the behavior of this program is undefined. That is incorrect. This misunderstanding comes from C 2018 6.5 2, which says:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined…
In your code, i / 2 uses i and fun(&i) contains a “side effect” on i (changing its value via an assignment). If these were unsequenced, the behavior would be undefined. However, there is a sequence point after evaluating the argument to fun and before calling it, and there are sequence points after each full expression in fun, including its return statement. Thus, there is some sequencing of the uses of i and the side effects on it. This sequencing is incompletely determined by the rules of the C standard, but it is, as defined by the standard, indeterminately sequenced, not unsequenced.
Why can't I do the following in Objective-C?
a = (a < 10) ? (a++) : a;
or
a = (a++)
The ++ in a++ is a post-increment. In short, a = a++ does the following:
int tmp = a;  // get the value of a
a = a + 1;    // post-increment a
a = tmp;      // do the assignment
As noted by others, in C this is actually undefined behavior and different compilers can order the operations differently.
Try to avoid using ++ and -- operators when you can. The operators introduce side effects which are hard to read. Why not simply write:
if (a < 10) {
a += 1;
}
To extend Sulthan's answer, there are several problems with your expressions, at least with the simple assignment (case 2).
A. There is no sense in doing so. Even though a++ has a value (it is a non-void expression) that can be assigned, it already stores its result in a itself. So the best you can expect is the equivalent of
a++;
The assignment adds nothing to the increment. But this is not what the error message is about.
B. Sulthan's replacement of the statement is the better case. It is even worse: the ++ operator yields the value of a (at the beginning of the expression) and has the effect of incrementing a at some point in the future. The increment can be delayed up to the next sequence point.
The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.
(ISO/IEC 9899:TC3, 6.5.2.4, 2)
But the assignment operator = is not a sequence point (Annex C).
The following are the sequence points described in 5.1.2.3:
[Neither assignment operator nor ) is included in the list]
Therefore the expression can be replaced with what Sulthan said:
int tmp = a;  // get the value of a
a = a + 1;    // post-increment a
a = tmp;      // do the assignment
with the result, that a still contains the old value.
Or the expression can be replaced with this code …:
int tmp = a;  // get the value of a
a = tmp;      // assignment
a = a + 1;    // increment
… with a different result (a is incremented). This is what the error message says: there is no defined sequence (order in which the operations have to be applied).
You can insert a sequence point using the comma operator , (which is its primary use), …
a++, a=a; // first increment, then assign
… but this shows the whole lack of meaning in what you want to do.
It is the same with your first example. Though ? is a sequence point …:
The following are the sequence points described in 5.1.2.3:
… The end of the first operand of the following operators: […] conditional ? (6.5.15);[…].
… both the increment (a++) and the assignment (a=) come after the ? operator is evaluated and are therefore unsequenced relative to each other ("in random order") again.
To make the comment "Be careful" more concrete: Don't use an incremented object in an expression twice. (Unless there is a clear sequence point).
int a = 1;
… = a++ * a;
… evaluates to what? 2? 1? Undefined, because the increment can take place after reading a "the second time".
BTW: The Q is not related to Objective-C, but to pure C. There is no Objective-C influence to C in that point. I changed the tagging.
I am writing a small test app in C with GCC 4.8.4 pre-installed on my Ubuntu 14.04, and I was confused by the fact that the expression a=(b++); behaves in the same way as a=b++; does. The following simple code is used:
#include <stdint.h>
#include <stdio.h>
int main(int argc, char* argv[]){
uint8_t a1, a2, b1=10, b2=10;
a1=(b1++);
a2=b2++;
printf("a1=%u, a2=%u, b1=%u, b2=%u.\n", a1, a2, b1, b2);
}
The result after gcc compilation is a1=a2=10, while b1=b2=11. However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
Namely, a1 should be 11 while a2 equals 10.
Does anyone have an idea about this issue?
However, I expected the parentheses to have b1 incremented before its value is assigned to a1
You should not have expected that: placing parentheses around an increment expression does not alter the application of its side effects.
Side effects (in this case, it means writing 11 into b1) get applied some time after retrieving the current value of b1. This could happen before or after the full assignment expression is evaluated completely. That is why a post-increment will remain a post-increment, with or without parentheses around it. If you wanted a pre-increment, place ++ before the variable:
a1 = ++b1;
Quoting from the C99:6.5.2.4:
The result of the postfix ++ operator is the value of the operand.
After the result is obtained, the value of the operand is incremented.
(That is, the value 1 of the appropriate type is added to it.) See the
discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of
operations on pointers. The side effect of updating the stored value
of the operand shall occur between the previous and the next sequence
point.
You can look up the C99: annex C to understand what the valid sequence points are.
In your question, adding parentheses doesn't change the sequence points; only the ; character does that.
Or in other words, you can view it as if there's a temporary copy of b, and the side effect is that the original b is incremented. Until a sequence point is reached, all evaluation is done on the temporary copy of b. When a sequence point is reached, the temporary copy of b is discarded and the side effect, i.e. the increment, is committed to storage.
Parentheses can be tricky to think about. But they do not mean, "make sure that everything inside happens first".
Suppose we have
a = b + c * d;
The higher precedence of multiplication over addition tells us that the compiler will arrange to multiply c by d, and then add the result to b. If we want the other interpretation, we can use parentheses:
a = (b + c) * d;
But suppose that we have some function calls thrown into the mix. That is, suppose we write
a = x() + y() * z();
Now, while it's clear that the return value of y() will be multiplied by the return value of z(), can we say anything about the order that x(), y(), and z() will be called in? The answer is, no, we absolutely cannot! If you're at all unsure, I invite you to try it, using x, y, and z functions like this:
int x() { printf("this is x()\n"); return 2; }
int y() { printf("this is y()\n"); return 3; }
int z() { printf("this is z()\n"); return 4; }
The first time I tried this, using the compiler in front of me, I discovered that function x() was called first, even though its result is needed last. When I changed the calling code to
a = (x() + y()) * z();
the order of the calls to x, y, and z stayed exactly the same, the compiler just arranged to combine their results differently.
Finally, it's important to realize that expressions like i++ do two things: they take i's value and add 1 to it, and then they store the new value back into i. But the store back into i doesn't necessarily happen right away, it can happen later. And the question of "when exactly does the store back into i happen?" is sort of like the question of "when does function x get called?". You can't really tell, it's up to the compiler, it usually doesn't matter, it will differ from compiler to compiler, if you really care, you're going to have to do something else to force the order.
And in any case, remember that the definition of i++ is that it gives the old value of i out to the surrounding expression. That's a pretty absolute rule, and it can not be changed just by adding some parentheses! That's not what parentheses do.
Let's go back to the previous example involving functions x, y, and z. I noticed that function x was called first. Suppose I didn't want that, suppose I wanted functions y and z to be called first. Could I achieve that by writing
a = x() + (y() * z())?
I could write that, but it doesn't change anything. Remember, the parentheses don't mean "do everything inside first". They do cause the multiplication to happen before the addition, but the compiler was already going to do it that way anyway, based on the higher precedence of multiplication over addition.
Up above I said, "if you really care, you're going to have to do something else to force the order". What you generally have to do is use some temporary variables and some extra statements. (The technical term is "insert some sequence points.") For example, to cause y and z to get called first, I could write
c = y();
d = z();
b = x();
a = b + c * d;
In your case, if you wanted to make sure that the new value of b got assigned to a, you could write
c = b++;
a = b;
But of course that's silly -- if all you want to do is increment b and have its new value assigned to a, that's what prefix ++ is for:
a = ++b;
Your expectations are completely unfounded.
Parentheses have no direct effect on the order of execution. They don't introduce sequence points into the expression and thus they don't force any side-effects to materialize earlier than they would've materialized without parentheses.
Moreover, by definition, post-increment expression b++ evaluates to the original value of b. This requirement will remain in place regardless of how many pair of parentheses you add around b++. Even if parentheses somehow "forced" an instant increment, the language would still require (((b++))) to evaluate to the old value of b, meaning that a would still be guaranteed to receive the non-incremented value of b.
Parentheses only affect the syntactic grouping between operators and their operands. For example, in your original expression a = b++ one might immediately ask whether the ++ applies to b alone or to the result of a = b. In your case, by adding the parentheses you simply explicitly forced the ++ operator to apply to (to group with) the b operand. However, according to the language syntax (and the operator precedence and associativity derived from it), ++ already applies to b, i.e. unary ++ has higher precedence than binary =. Your parentheses did not change anything; they only reiterated the grouping that was already there implicitly. Hence no change in the behavior.
Parentheses are entirely syntactic. They just group expressions and they are useful if you want to override the precedence or associativity of operators. For example, if you use parentheses here:
a = 2*(b+1);
you mean that the result of b+1 should be doubled, whereas if you omit the parentheses:
a = 2*b+1;
you mean that just b should be doubled and then the result should be incremented. The two syntax trees for these assignments are:
    =                      =
   / \                    / \
  a   *                  a   +
     / \                    / \
    2   +                  *   1
       / \                / \
      b   1              2   b
a = 2*(b+1);          a = 2*b+1;
By using parentheses, you can therefore change the syntax tree that corresponds to your program and (of course) different syntax may correspond to different semantics.
On the other hand, in your program:
a1 = (b1++);
a2 = b2++;
parentheses are redundant because the assignment operator has lower precedence than the postfix increment (++). The two assignments are equivalent; in both cases, the corresponding syntax tree is the following:
    =
   / \
  a   ++ (postfix)
      |
      b
Now that we're done with the syntax, let's go to semantics. This statement means: evaluate b++ and assign the result to a. Evaluating b++ returns the current value of b (which is 10 in your program) and, as a side effect, increments b (which now becomes 11). The returned value (that is, 10) is assigned to a. This is what you observe, and this is the correct behaviour.
However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
You aren't assigning b1 to a1: you're assigning the result of the postincrement expression.
Consider the following program, which prints the value of b when executing assignment:
#include <iostream>
using namespace std;
int b;
struct verbose
{
int x;
void operator=(int y) {
cout << "b is " << b << " when operator= is executed" << endl;
x = y;
}
};
int main() {
// your code goes here
verbose a;
b = 10;
a = b++;
cout << "a is " << a.x << endl;
return 0;
}
I suspect this is undefined behavior, but nonetheless when using ideone.com I get the output shown below
b is 11 when operator= is executed
a is 10
OK, in a nutshell: b++ is a unary expression, and parentheses around it will never influence the precedence of arithmetic operations, because the ++ increment operator has one of the highest (if not the highest) precedence in C. In a * (b + c), by contrast, (b + c) is a binary expression (not to be confused with the binary numbering system!) because it has two operands, b and its addend c. So it can easily be remembered like this: parentheses put around binary, ternary, quaternary...+INF expressions will almost always influence precedence(*); parentheses around unary ones NEVER will, because these are "strong enough" to "withstand" grouping by parentheses.
(*)As usual, there are some exceptions to the rule, if only a handful: e. g. -> (to access members of pointers on structures) has a very strong binding despite being a binary operator. However, C beginners are very likely to take quite awhile until they can write a -> in their code, as they will need an advanced understanding of both pointers and structures.
The parentheses will not change the post-increment behaviour itself.
a1=(b1++); //b1=10
It is equivalent to
uint8_t mid_value = b1++; //10
a1 = (mid_value); //10
Placing ++ after a variable (known as post-increment) means that the expression yields the variable's value from before the increment.
Enclosing the expression in parentheses doesn't change that: the value used is still the one from before the increment.
From learn.geekinterview.com:
In the postfix form, the increment or decrement takes place after the value is used in expression evaluation.
In prefix increment or decrement operation the increment or decrement takes place before the value is used in expression evaluation.
That's why a = (b++) and a = b++ are the same in terms of behavior.
In your case, if you want to increment b first, you should use pre-increment, ++b instead of b++ or (b++).
Change
a1 = (b1++);
to
a1 = ++b1; // b1 is incremented before its value is assigned to a1.
To make it short:
b++ is incremented after the statement is done
But even after that, the result of b++ is put to a.
Because of that parentheses do not change the value here.
In code snippet below
int jo=50;
if( jo =(rand()%100), jo!=50)
{
printf("!50");
}
% has highest precedence so rand()%100 will get executed first
!= has the precedence greater than = so jo != 50 should get execute right ?
, has the least precedence
Still, when I execute it, the assignment occurs first, then !=, and then , . I get the output !50. Why?
The issue is "sequence points":
http://www.angelikalanger.com/Articles/VSJ/SequencePoints/SequencePoints.html
Problematic vs. Safe Expressions
What is it that renders the
assignment x[i]=i++ + 1; a problematic one whereas the assignment
i=2; is harmless, in the sense that its result is well-defined and
predictable? The crux is that in the expression x[i]=i++ + 1; there
are two accesses to variable i and one of the accesses, namely the
i++, is a modifying access. Since the order of evaluation between
sequence points is not defined we do not know whether i will be
modified before it will be read or whether it will be read before the
modification. Hence the root of the problem is multiple access to a
variable between sequence points if one of the accesses is a
modification.
Here is another example. What will happen here if i and j have
values 1 and 2 before the statement is executed?
f(i++, j++, i+j);
Which value will be passed to function f as the third argument?
Again, we don't know. It could be any of the following: 3, 4, or 5. It
depends on the order in which the function arguments are evaluated.
The common misconception here is that the arguments would be evaluated
left to right. Or maybe right to left? In fact, there is no order
whatsoever mandated by the language definition.
Precedence does not control the order of execution. Precedence only controls the grouping - that is, precedence says what the operands are for each operation, not when each operation happens.
In this example, the precedence of % is irrelevant due to the parentheses - these say that the operands of % are rand() and 100.
The precedence of , being lower than that of = and != tells us that the operands of = are jo and (rand()%100), and that the operands of != are jo and 50.
The operands of , are then jo = (rand() % 100) and jo != 50.
The definition of the , operator says that the first operand is evaluated, then there is a sequence point, and then the second operand is evaluated. So in this case, jo = (rand() % 100) is fully evaluated, which stores the result of rand() % 100 into jo; and then jo != 50 is evaluated. The value of the overall expression is the value of jo != 50.
Well, sequence points are the right answer, but let's translate from the textbook-ese.
The comma operator has a special property: it makes sure that what's on its left hand side is evaluated first. So, when you get to the expression
jo =(rand()%100), jo!=50
even though the != binds more tightly than ',', so that the expression, fully parenthesized, is
(jo =(rand()%100)),(jo!=50)
the first part is evaluated first.
To remember this, you can pronounce or read the comma operator as "and then", so
jo=(rand()%100)
"and then"
jo!=50.
It's a mistake to think of "precedence" as "done first".
Consider the following code snippet:
f() + g() + h()
Which add operation has higher precedence, the one that sums the results of f() and g(), or the one that sums the results of that and h()?
It's a trick question, because there is no need to invoke "precedence" at all. But there is still an order of operations, because function calls in C introduce "sequence points", which is how C allows you to determine "what happens when", as it were.
In your particular code, you have a comma operator—which is quite different from the comma punctuator in function arguments—in this part:
jo = (rand() % 100), jo != 50
The comma operator introduces a sequence point (as does the function call to rand), so we know that rand runs and produces a value, then that value % 100 is computed and assigned to jo, and finally jo is compared with 50.
(There is a sequence point after the evaluation of the controlling expression in the if as well, and one at each statement-ending semicolon.)