Operator associativity in C, specifically prefix and postfix increment and decrement

In C operation associativity is as such for increment, decrement and assignment.
2. postfix ++ and --
3. prefix ++ and --
16. Direct assignment =
The full list is found here Wikipedia Operators in C
My question is when we have
int a, b;
b = 1;
a = b++;
printf("%d", a); // a is equal to 1
b = 1;
a = ++b;
printf("%d", a); //a is equal to 2
Why is a equal to 1 with b++ when the postfix increment operator should happen before the direct assignment?
And why is the prefix increment operator different than the postfix when they are both before the assignment?
I'm pretty sure I don't understand something very important when it comes to operation associativity.

The postfix operator a++ will increment a and then return the original value i.e. similar to this:
{ temp=a; a=a+1; return temp; }
and the prefix ++a will return the new value i.e.
{ a=a+1; return a; }
This is irrelevant to the operator precedence.
(And associativity governs whether a-b-c equals (a-b)-c or a-(b-c).)

Operator precedence and associativity do not tell you what happens before and what happens after. Operator precedence/associativity has nothing to do with it. In the C language, temporal relationships like "before" or "after" are defined by so-called sequence points and only by sequence points (and that's a totally separate story).
Operator precedence/associativity simply tells you which operands belong to which operators. For example, the expression a = b++ can be formally interpreted as (a = b)++ and as a = (b++). Operator precedence/associativity in this case simply tells you that the latter interpretation is correct and the former is incorrect (i.e. ++ applies to b and not to the result of a = b).
That, once again, does not mean that b should be incremented first. Operator precedence/associativity, once again, has nothing to do with what happens "first" and what happens "next". It simply tells you that the result of the b++ expression is assigned to a. By definition, the result of b++ (postfix increment) is the original value of b. This is why a will get the original value of b, which is 1. When the variable b gets incremented is completely irrelevant, as long as a gets assigned b's original value. The compiler is allowed to evaluate this expression in any order and increment b at any time: anything goes, as long as a somehow gets the original value of b (and nobody really cares how that "somehow" works internally).
For example, the compiler can evaluate a = b++ as the following sequence of elementary operations
(1) a := b
(2) b := b + 1
or it can evaluate it as follows
(1) b := b + 1
(2) a := b - 1
Note that in the first case b is actually incremented at the end, while in the second case b is incremented first. But in both cases a gets the same correct value - the original value of b, which is what it should get.
But I have to reiterate that the above two examples are here just for illustrative purposes. In reality, expressions like a = ++b and a = b++ have no sequence points inside, which means that from your point of view everything in these expressions happens simultaneously. There's no "before", "after", "first", "next" or "last". Such expressions are "atomic" in a sense that they cannot be meaningfully decomposed into a sequence of smaller steps.

As AndreyT already pointed out, precedence and associativity don't tell you about order of evaluation. They only tell you about grouping. For example, precedence is what tells us that a*b+c is grouped as (a*b)+c instead of a*(b+c). The compiler is free to evaluate a, b and c in any order it sees fit with either of those expressions. Associativity tells you about grouping when you have operators of the same precedence, most often, the same operators. For example, it's what tells you that a-b-c is equivalent to (a-b)-c, not a-(b-c) (otherwise stated, subtraction is left associative).
Order of evaluation is defined by sequence points. There's a sequence point at the end of a full expression (among other things). At the sequence point, all the previous evaluations have to have taken place, and none of the subsequent evaluations can have taken place yet.
Looking at your specific examples, in a=b++;, the result is mostly from the definition of post-increment itself. A post-increment yields the previous value of the variable, and sometime before the next sequence point, the value of that variable will be incremented. A pre-increment yields the value of the variable with the increment applied. In neither case, however, does that mean the variable has to be incremented in any particular order relative to the assignment. For example, in your pre-increment example, the compiler is entirely free to do something equivalent to:
temp = b+1;
a = temp;
b = b + 1;
Likewise, in the post-increment version, the variable can be incremented before or after the assignment:
a = b;
b = b + 1;
or:
temp = b;
b = b + 1;
a = temp;
Either way, however, the value assigned to a must be the value of b before it's incremented.

Related

Will this expression evaluate to true or false (1 or 0) in C?

#include<stdio.h>
int main()
{
int a=4;
int b=4;
int c= a++ < ++b? 1 : 0;
printf ("%d",c);
}
It is known that there is a sequence point at ?, which means that both the prefix and postfix operations have to be completed by that point. Also it is known(?) that b is incremented before the comparison. However, is a incremented before or after the comparison?
If it is incremented before the < test, then the Boolean evaluates to false and c is set to 0, else to true with c being set to 1. In my compiler, it evaluates to true, which means a++ is performed after the comparison operation with c being set to 1.
Is this behavior part of the specification though?
I modified it to
#include<stdio.h>
int main()
{
int a=4;
int b=4;
int d=2;
int c= a++ + d < ++b + d? 1 : 0;
printf ("%d",c);
}
and it still evaluates to 1. The postfix has to complete before the ?, but does that really ensure that it happens after the comparison < ?
a++ returns the value of a before the increment. ++b returns the value of b after the increment. Thus this evaluates to 1.
As neither a nor b are used more than once in the expression, no undefined behavior exists.
Also it is known(?) that b is incremented before the comparison.
However, is a incremented before or after the comparison?
This is a subtle point, but it's important to understand what's really going on here.
Both the subexpressions a++ and ++b do two things. They compute a new value to be used in the surrounding expression, and they update the stored value of the variable they're operating on.
So a++ does this:
it yields the old value of a (4) out to the surrounding expression
it stores a new value (5) into a.
And ++b does this:
it yields the new value of b (4+1 or 5) out to the surrounding expression
it stores a new value (5) into b.
Notice that in both cases it's thing 1 that the < operator cares about. And, in both cases, thing 1 is an absolute definition, it doesn't depend on timing.
Or, in other words, asking "Is a/b incremented before or after the comparison?" is not really the right question. The values a and b+1 participate in the comparison, and that's it.
Where the timing comes in is things 2. We don't know, precisely, when the new value gets stored back into a. Nor do we know precisely when the new value gets stored back into b. All we know is that those stores will happen sometime before the next sequence point (which, as you correctly note, in this case is the ? part of the ternary operator).
But nothing depends on those timings, so there's no undefined behavior here.
Undefined behavior comes in when either
the variable that's modified (a or b) also has its value independently used elsewhere in the expression, meaning that we don't know whether that use uses the old or the new value
the same variable is modified twice, meaning that we don't know which of the two modifications "wins"
But, again, neither of those problems occurs here, so the expression is well-defined.
This is what C11 says about the two:
Postfix ++:
The result of the postfix ++ operator is the value of the operand. As
a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it).....
The value computation of the result is sequenced before the side
effect of updating the stored value of the operand.
Prefix ++:
The value of the operand of the prefix ++ operator is incremented. The
result is the new value of the operand after incrementation. The
expression ++E is equivalent to (E+=1).
Aside:
int c= a++ < ++b? 1 : 0;
The operator < returns either 1 or 0, or true or false. So the above statement can also be written as:
int c = a++ < ++b;
From the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the operand.
As a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it).
So in this declaration
int c= a++ < ++b? 1 : 0;
the value of the sub-expression a++ used in the initializer is the value of the operand a before its increment.
On the other hand (The C Standard (6.5.3.1 Prefix increment and decrement operators) )
2 The value of the operand of the prefix ++ operator is incremented.
The result is the new value of the operand after incrementation.
So the value of the sub-expression ++b is the value after incrementing b.
Hence you have in fact
int c = 4 < 5 ? 1 : 0;
As for the sequence point then to demonstrate it you could write for example
int c = a++ < b++ ? a : b;
In this case the comparison 4 < 4 is false, so c gets the value of b. Because of the sequence point at ?, the increment of b has already been applied by then, so c is 5.

Operator associativity, precedence

I just wonder if, for the following code, the compiler uses associativity/precedence alone or some other logic to evaluate.
int i = 0, k = 0;
i = k++;
If we evaluate based on associativity and precedence, postfix ++ has higher precedence than =, so k++(which becomes 1) is evaluated first and then comes =, now the value of k which is 1 is assigned to i.
So the value of i and k would be 1. However, the value of i is 0 and k is 1.
So I think that the compiler splits this i = k++; into two (i = k; k++;). So here compiler is not going for the statements associativity/precedence, it splits the line as well. Can someone explain how the compiler resolves these kinds of statements?
++ does two separate things.
k++ does two things:
It has the value of k before any increment is performed.
It increments k.
These are separate:
Producing the value of k occurs as part of the main evaluation of i = k++;.
Incrementing k is a side effect. It is not part of the main evaluation. The program may increment the value of k after evaluating the rest of the expression or during it. It may even increment the value before the rest of the expression, as long as it “remembers” the pre-increment value to use for the expression.
Precedence and associativity are not involved.
This effectively has nothing to do with precedence or associativity. The increment part of a ++ operator is always separate from the main evaluation of an expression. The value used for k++ is always the value of k before the increment regardless of what other operators are present.
Supplement
It is important to understand that the increment part of ++ is detached from the main evaluation and is sort of “floating around” in time–it is not anchored to a certain spot in the code, and you do not control when it occurs. This is important because if there is another use or modification of the operand, such as in k * k++, the increment can occur before, during, or after the main evaluation of the other occurrence. When this happens, the C standard does not define the behavior of the program.
Postfix operators have higher precedence than assignment operators.
This expression with the assignment operator
i = k++
contains two operands.
It can equivalently be rewritten as
i = ( k++ );
The value of the expression k++ is 0. So the variable i will get the value 0.
The operands of the assignment operator can be evaluated in any order.
According to the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the operand.
As a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it).
And (6.5.16 Assignment operators)
3 An assignment operator stores a value in the object designated by
the left operand. An assignment expression has the value of the left
operand after the assignment,111) but is not an lvalue. The type of an
assignment expression is the type the left operand would have after
lvalue conversion. The side effect of updating the stored value of
the left operand is sequenced after the value computations of the left
and right operands. The evaluations of the operands are unsequenced.
Unlike C++, C does not have "pass by reference". Only "pass by value". I'm going to borrow some C++ to explain. Let's implement the functionality of ++ for both postfix and prefix as regular functions:
// Same as ++x
int inc_prefix(int &x) { // & is for pass by reference
x += 1;
return x;
}
// Same as x++
int inc_postfix(int &x) {
int tmp = x;
x += 1;
return tmp;
}
So your code is now equivalent to:
i = inc_postfix(k);
EDIT:
It's not completely equivalent for more complex things. Function calls introduce sequence points, for instance. But the above is enough to explain what happens for OP.
It's similar to (only with an additional sequence point for illustration):
i = k; // i = 0
k = k + 1; // k = 1
Operator associativity doesn't apply here. Operator precedence merely states which operand sticks to which operator. It's not particularly relevant in this case; it just says that the expression should be parsed as i = (k++); and not as (i = k)++; which wouldn't make any sense.
From there on, how this expression is evaluated/executed is specified by specific rules for each operator. The postfix operator is specified to behave as (6.5.2.4):
The value computation of the result is sequenced before the side effect of
updating the stored value of the operand.
That is, k++ is guaranteed to evaluate to 0 and then at some point later on, k is increased by 1. We don't really know when, only that it happens somewhere after k++ is evaluated and before the next sequence point, in this case the ; at the end of the line.
The assignment operator behaves as (6.5.16):
The side effect of updating the stored value of the left operand is
sequenced after the value computations of the left and right operands.
In this case, the right operand of = has its value computed before updating the left operand.
In practice, this means that the executable can look as either this:
k is evaluated to 0
set i to 0
increase k by 1
semicolon/sequence point
Or this:
k is evaluated to 0
increase k by 1
set i to 0
semicolon/sequence point
Precedence and associativity only affect how operators and operands are associated with each other - they do not affect the order in which expressions are evaluated. Precedence rules dictate that
i = k++
is parsed as
i = (k++)
instead of something like
(i = k)++
The postfix ++ operator has a result and a side effect. In the expression
i = k++
the result of k++ is the current value of k, which gets assigned to i. The side effect is to increment k.
It's logically equivalent to writing
tmp = k
i = tmp
k = k + 1
with the caveat that the assignment to i and the update to k can happen in any order - the operations can even be interleaved with each other. What matters is that i gets the value of k before the increment and that k gets incremented, not necessarily the order in which those operations occur.
The fundamental issue here is that precedence is not the right way to think about what
i = k++;
means.
Let's talk about what k++ actually means. The definition of k++ is that it gives you the old value of k, and then adds 1 to the stored value of k. (Or, stated another way, it takes the old value of k, plus 1, and stores it back into k, while giving you the old value of k.)
As far as the rest of the expression is concerned, the important thing is what the value of k++ is. So when you say
i = k++;
the answer to the question of "What gets stored in i?" is, "The old value of k".
When we answer the question of "What gets stored in i?", we don't think about precedence at all. We think about the meaning of the postfix ++ operator.
See also this older question.
Postscript: The other thing you have to be really careful about is when you think about the side question, "When does it store the new value into k?" It turns out that's a really hard question to answer, because the answer is not as well defined as you might like. The new value gets stored back into k sometime before the end of the larger expression it's in (formally, "before the next sequence point"), but we don't know whether it happens before or after, say, the point at which the thing gets stored into i, or before or after other interesting points in the expression.
Ahh, this is quite an interesting question. To help you understand better, this is what actually happens.
I'm going to try to explain using a bit of operator overloading concepts from C++, so bear with me if you do not know C++.
This is how you would overload the postfix-increment operator:
int operator++(int) // Note that the 'int' parameter is just a C++ way of saying that this is the postfix and not prefix operator
{
int copy = *this; // *this just means the current object which is calling the function
*this += 1;
return copy;
}
Essentially what the postfix-increment operator does is that it creates a copy of the operand, increases the original variable, and then returns the copy.
In your case of i = k++, you can think of k++ as happening first, like a function call, but the value it returns is the old value of k. This then gets assigned to i.

How is the increment operator evaluated in C programs?

I have two expressions:
int a=5;
int c=++a;// c=6, a=6
int b=a++;// b=6, a=7
In the second instruction, the increment is evaluated first and in the third instruction, the increment is evaluated after the assignment.
I know that the increment operator has a higher priority. Can anyone explain to me why it's evaluated after assignment in the third expression?
The result is not related to the order of operations but to the definition of prefix ++ and postfix ++.
The expression ++a evaluates to the incremented value of a. In contrast, the expression a++ evaluates to the current value of a, and a is incremented as a side effect.
Section 6.5.2.4p2 of the C standard says the following about postfix ++:
The result of the postfix ++ operator is the value of the
operand. As a side effect, the value of the operand object is
incremented (that is, the value 1 of the appropriate type is added to
it).
And section 6.5.3.1p2 says the following about prefix ++:
The value of the operand of the prefix ++ operator is incremented. The
result is the new value of the operand after incrementation. The
expression ++E is equivalent to (E+=1)
++a and a++ are simply different operators, despite the same symbol ++. One is prefix-increment, one is postfix-increment. This has nothing to do with priority compared to assignment. (just like a - b and -a are different operators despite the same symbol -.)
EDIT: It was pointed out that this is about C and not C++... oops. So, the following explanation may be confusing if you only know C; all you need to know is that int& is a reference to an int, so it's like having a pointer but without the need to dereference it, so modifying a inside of these functions actually modifies the variable you passed into the functions.
You could imagine them like functions:
int prefixIncrement(int& a) {
return ++a;
}
...is the same as:
int prefixIncrement(int& a) {
a += 1;
return a;
}
And:
int postfixIncrement(int& a) {
return a++;
}
...is the same as:
int postfixIncrement(int& a) {
int old = a;
a += 1;
return old;
}
For nitpickers: Yes, actually we'd need move semantics on the return value for postfixIncrement.

Priority operators in C

I found this text (source: https://education.cppinstitute.org/) and I'm trying to understand the second instruction.
Can you answer the question of what distinguishes these two instructions?
c = *p++;
and
c = (*p)++;
We can explain: the first assignment is as if the following two disjoint instructions have been performed;
c = *p;
p++;
In other words, the character pointed to by p is copied to the c variable; then, p is increased and points to the next element of the array.
The second assignment is performed as follows:
c = *p;
string[1]++;
The p pointer is not changed and still points to the second element of the array, and only this element is increased by 1.
What I don't understand is why it is not incremented when the = operator has less priority than the ++ operator.
With respect to this statement expression
c = (*p)++;
, you say
What i dont understand is why [p] is not incremented when the =
operator has less priority than the ++ operator.
There is a very simple explanation: p is not incremented as a result of evaluating that expression because it is not the operand of the ++ operator.
That is in part exactly because the = operator has lower precedence: because the precedence of = is so low, the operand of ++ is the expression (*p) rather than the expression c = (*p). Note in particular that p itself is not even plausibly in the running to be the operand in that case, unlike in the variation without parentheses.
Moving on, the expression (*p) designates the thing to which p points, just as *p all alone would do. Context suggests that at that time, that's the same thing designated by string[1]. That is what gets incremented, just as the text says, and its value prior to the increment is the result of the postfix ++ operation.
What I don't understand is why it is not incremented when the = operator has less priority than the ++ operator.
The value for example of the expression
x++
is the value of x before incrementing.
So if you'll write
y = x++;
then the variable y gets the value of x before its incrementing.
From the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the operand.
As a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it). ... The
value computation of the result is sequenced before the side effect of
updating the stored value of the operand. ...
If instead of the expression
c = (*p)++;
you'll write
c = ++(*p);
then you get the result you expected. This demonstrates the difference between the postfix increment operator ++ and the prefix (unary) increment operator ++.
When the ++ is following a variable, the variable is incremented after it has been used.
So when you have
y = x++;
x is incremented after y gets the value of x.
This is how it works for the -- operator also.

C operator order

Why is the postfix increment operator (++) executed after the assignment (=) operator in the following example? According to the precedence/priority lists for operators ++ has higher priority than = and should therefore be executed first.
int a,b;
b = 2;
a = b++;
printf("%d\n",a);
will output a = 2.
PS: I know the difference between ++b and b++ in principle, but just looking at the operator priorities, these precedence lists tell us something different, namely that ++ should be executed before =.
++ is evaluated first. It is post-increment, meaning it evaluates to the value stored and then increments. Any operator on the right side of an assignment expression (except for the comma operator) is evaluated before the assignment itself.
It is. It's just that, conceptually at least, ++ happens after the entire expression a = b++ (which is an expression with value a) is evaluated.
Operator precedence and order of evaluation of operands are rather advanced topics in C, because there exist many operators that have their own special cases specified.
Postfix ++ is one such special case, specified by the standard in the following manner (6.5.2.4):
The value computation of the result is sequenced before the side
effect of updating the stored value of the operand.
It means that the compiler will translate the line a = b++; into something like this:
Read the value of b into a CPU register. ("value computation of the result")
Increase b. ("updating the stored value")
Store the CPU register value in a.
This is what makes postfix ++ different from prefix ++.
The increment operators do two things: add +1 to a number and return a value. The difference between post-increment and pre-increment is the order of these two steps. So the increment actually is executed first and the assignment later in any case.
