My question is about the following line of code, taken from "The C Programming Language" 2nd Edition:
*p++->str;
The book says that this line of code increments p after accessing whatever str points to.
My understanding is as follows:
Precedence and associativity say that the order in which the operators will be evaluated is
->
++
*
The postfix increment operator ++ yields a value (i.e. value of its operand), and has the side effect of incrementing this operand before the next sequence point (i.e. the following ;)
Precedence and associativity describe the order in which operators are evaluated and not the order in which the operands of the operators are evaluated.
My Question:
My question is around the evaluation of the highest precedence operator (->) in this expression. I believe that to evaluate this operator means to evaluate both of the operands, and then apply the operator.
From the perspective of the -> operator, is the left operand p or p++? I understand that both return the same value.
However, if the first option is correct, I would ask "how is it possible for the evaluation of the -> operator to ignore the presence of the ++".
If the second option is correct, I would ask "doesn't the evaluation of -> in this case then require the evaluation of a lower precedence operator ++ here (and the evaluation of ++ completes before that of ->)"?
To understand the expression *p++->str you need to understand how *p++ works, or in general how postfix increment works on pointers.
In the case of *p++, the dereference is applied to the value p had before the increment.
n1570 - §6.5.2.4/2:
The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it). [...]. The value computation of the result is sequenced before the side effect of updating the stored value of the operand.
In the case of *p++->str, postfix ++ and -> have equal precedence, which is higher than that of the unary * operator. By the operator precedence and associativity rules, this expression is parenthesised as *((p++)->str).
One important note here is that precedence and associativity have nothing to do with the order of evaluation. So, although ++ has higher precedence than *, it is not guaranteed that the increment of p takes effect first. The expression p++ (in the expression *p++->str) is evaluated according to the rule quoted above from the standard: (p++)->str accesses the str member of the object p pointed to before the increment, that value is then dereferenced, and the increment of p takes effect at some point before the next sequence point.
Postfix ++ and -> have the same precedence. a++->b parses as (a++)->b, i.e. ++ is done first.
*p++->str; executes as follows:
The expression parses as *((p++)->str). -> is a meta-postfix operator, i.e. ->foo is a postfix operator for all identifiers foo. Postfix operators have the highest precedence, followed by prefix operators (such as *). Associativity doesn't really apply: There is only one operand and only one way to "associate" it with a given operator.
p++ is evaluated. This yields the (old) value of p and schedules an update, setting p to p+1, which will happen at some point before the next sequence point. Call the result of this expression tmp0.
tmp0->str is evaluated. This is equivalent to (*tmp0).str: It dereferences tmp0, which must be a pointer to a struct or union, and gets the str member. Call the result of this expression tmp1.
*tmp1 is evaluated. This dereferences tmp1, which must be a pointer (to a complete type). Call the result of this expression tmp2.
tmp2 is ignored (the expression is in void context). We reach ; and p must have been incremented before this point.
Here is another naïve question from a C newbie: on this page, https://en.cppreference.com/w/c/language/operator_precedence, the precedence of the postfix increment is listed to be higher than that of pointer dereference. So I was expecting in the following code that the pointer is incremented first (pointing at 10) and then dereferenced.
#include <stdio.h>
int main()
{
    int a[] = {3, 10, 200};
    int *p = a;
    printf("%d", *p++);
    return 0;
}
However, this code still outputs the first array item (3). What am I missing about the concept?
Precedence determines where the implied parentheses are placed.
The expression *p++ can be parenthesized as
(*p)++ // incorrect precedence
*(p++) // correct precedence
Note that the value of p++ is the value of p before any change, so with the correct parenthesization the result is the same as that of *p; the side effect of ++ is not reflected in it. The change to p itself does not alter the result of *(p++).
As you have correctly assumed, the expression *p++ is evaluated as *(p++); that is, the ++ operator has higher precedence than the * operator.
However, the value of the expression, p++, is just the value of p (i.e. its value before the increment). A side-effect of the operation is that the value of p is incremented after its value has been acquired.
From this Draft C11 Standard:
6.5.2.4 Postfix increment and decrement operators
…
2 The result of the postfix ++ operator is the
value of the operand. As a side effect, the value of the operand
object is incremented (that is, the value 1 of the appropriate type is
added to it). … The value computation of the result is sequenced
before the side effect of updating the stored value of the operand. With
respect to an indeterminately-sequenced function call, the operation of
postfix ++ is a single evaluation. …
Operator precedence specifies how an expression is parsed. Since postfix ++ has higher precedence than *, the expression is equivalent to *(p++) rather than (*p)++, which would have given it a completely different meaning.
But although this makes p++ the operand of *, it doesn't change how the ++ operator itself behaves. The C language specifies this operator to behave as (from C17 6.5.2.4/2):
"The value computation of the result is sequenced before the side effect of
updating the stored value of the operand."
This means that p++ always gives the value of p before ++ is applied. In this case p is a pointer, so the value will be the address it pointed at prior to this expression. So the code is completely equivalent to this:
int* tmp = p;
p++;
printf("%d", *tmp);
Precedence controls which operators are grouped with which operands. Postfix ++ having higher precedence than unary * simply means that *p++ is parsed as *(p++) instead of (*p)++.
*(p++) means you are dereferencing the result of p++. The result of p++ is the current value of p. As a side effect p is incremented. It is logically equivalent to
tmp = p;
printf( "%d\n", *tmp );
p = p + 1;
where the printf call and the update to p can happen in any order, even simultaneously (interleaved or in parallel).
I found this text (source: https://education.cppinstitute.org/) and I'm trying to understand the second instruction.
Can you answer the question of what distinguishes these two instructions?
c = *p++;
and
c = (*p)++;
We can explain: the first assignment is as if the following two disjoint instructions have been performed;
c = *p;
p++;
In other words, the character pointed to by p is copied to the c variable; then, p is increased and points to the next element of the array.
The second assignment is performed as follows:
c = *p;
string[1]++;
The p pointer is not changed and still points to the second element of the array, and only this element is increased by 1.
What I don't understand is why it is not incremented when the = operator has less priority than the ++ operator.
With respect to this statement expression
c = (*p)++;
, you say
What i dont understand is why [p] is not incremented when the =
operator has less priority than the ++ operator.
There is a very simple explanation: p is not incremented as a result of evaluating that expression because it is not the operand of the ++ operator.
That is in part exactly because the = operator has lower precedence: because the precedence of = is so low, the operand of ++ is the expression (*p) rather than the expression c = (*p). Note in particular that p itself is not even plausibly in the running to be the operand in that case, unlike in the variation without parentheses.
Moving on, the expression (*p) designates the thing to which p points, just as *p all alone would do. Context suggests that at that time, that's the same thing designated by string[1]. That is what gets incremented, just as the text says, and its value prior to the increment is the result of the postfix ++ operation.
What I don't understand is why it is not incremented when the = operator has less priority than the ++ operator.
For example, the value of the expression
x++
is the value of x before incrementing.
So if you write
y = x++;
then the variable y gets the value x had before the increment.
From the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the operand.
As a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it). ... The
value computation of the result is sequenced before the side effect of
updating the stored value of the operand. ...
If instead of the expression
c = (*p)++;
you write
c = ++(*p);
then you get the result you expected. This demonstrates the difference between the postfix increment operator ++ and the prefix (unary) increment operator ++.
When the ++ is following a variable, the variable is incremented after it has been used.
So when you have
y = x++;
x is incremented after y gets the value of x.
This is how it works for the -- operator also.
Running this code:
#include <stdio.h>
int main() {
    int x[]={20,30};
    int *p=x;
    ++*p++;
    printf("%d %d\n",x[0],*p);
    return 0;
}
the output is 21 30, which doesn't make sense to me: according to C operator precedence the postfix increment comes first, so in my opinion the output should be 20 31. For the record, I am new to programming and it really seems that I can't get the hang of it, so sorry if this question is stupid :)
From the C++ Standard (the same is valid for the C Standard)
5.2 Postfix expressions
1 Postfix expressions group left-to-right.
Postfix expressions (and p++ is a postfix expression) have higher priority than unary expressions.
The C++ Standard
5.3 Unary expressions
1 Expressions with unary operators group right-to-left.
In this expression ++*p there are two unary subexpressions: *p and ++( *p )
So the whole expression can be written like
++( *( p++ ) );
Take into account regarding the postfix expression ++ that (now it is the C Standard)
6.5.2.4 Postfix increment and decrement operators
2 The result of the postfix ++ operator is the value of the operand.
As a side effect, the value of the operand object is incremented (that
is, the value 1 of the appropriate type is added to it).
Let's consider the result of the expression statement
++( *( p++ ) );
The subexpression p++ has the value of its operand, that is, the address (of type int *) of the first element of the array. Then, through the dereference, the expression *( p++ ) yields an lvalue designating the first element of the array, x[0], and its value is then increased. So the first element of the array now has the value 21.
At the same time, the postfix increment incremented the pointer p as its side effect (see the quote above from the C Standard). It now points to the second element of the array.
Thus the output will be
21 30
You first increment what p points to, and then you advance the pointer by one.
So, p points to 20, thus ++20 = 21.
Then the pointer is increased by one and, due to pointer arithmetic, it points to the element after 20 in your array, which is 30.
As M.M said, you are confusing the order of evaluation with precedence. Read more about it here.
according to C operator precedence the postfix increment comes first
Precedence is not the same thing as order of evaluation.
Precedence controls which operators are grouped with which operands. In this case, the expression ++*p++; is parsed as ++(*(p++)).
The order of evaluation is
1. Evaluate p++; the result of this evaluation is &x[0], and the side effect is to advance p to point to x[1];
2. Dereference the result of 1; the result of this evaluation is x[0];
3. Apply the prefix ++ operator to the result of 2; the result of this evaluation is x[0] + 1, with the side effect that the value stored in x[0] is incremented.
Remember that side effects do not have to be applied immediately upon evaluation; they may be deferred until a sequence point.
Why is the postfix increment operator (++) executed after the assignment (=) operator in the following example? According to the precedence/priority lists for operators ++ has higher priority than = and should therefore be executed first.
int a,b;
b = 2;
a = b++;
printf("%d\n",a);
will output 2, i.e. a = 2.
PS: I know the difference between ++b and b++ in principle, but just looking at the operator priorities, these precedence lists tell us something different, namely that ++ should be executed before =
++ is evaluated first. It is post-increment, meaning it evaluates to the value stored and then increments. Any operator on the right side of an assignment expression (except for the comma operator) is evaluated before the assignment itself.
It is. It's just that, conceptually at least, ++ happens after the entire expression a = b++ (which is an expression with value a) is evaluated.
Operator precedence and order of evaluation of operands are rather advanced topics in C, because there exist many operators that have their own special cases specified.
Postfix ++ is one such special case, specified by the standard in the following manner (6.5.2.4):
The value computation of the result is sequenced before the side
effect of updating the stored value of the operand.
It means that the compiler will translate the line a = b++; into something like this:
Read the value of b into a CPU register. ("value computation of the result")
Increase b. ("updating the stored value")
Store the CPU register value in a.
This is what makes postfix ++ different from prefix ++.
The increment operators do two things: add 1 to a number and return a value. The difference between post-increment and pre-increment is the order of these two steps. So in either case the increment is actually executed first and the assignment later.
From page 123 of The C Programming Language by K&R:
(p++)->x increments p after accessing x. (This last set of parentheses is unnecessary. Why?)
Why is it unnecessary considering that -> binds stronger than ++?
EDIT: Contrast the given expression with ++p->x, the latter is evaluated as ++(p->x) which would increment x, not p. So in this case parentheses are necessary and we must write (++p)->x if we want to increment p.
The only other possible interpretation is:
p++(->x)
and that doesn't mean anything. It's not even valid. The only possible way to interpret this in a valid way is (p++)->x.
Exactly because -> binds stronger than ++. (it doesn't, thanks @KerrekSB.)
increments p after accessing x.
So first you access x of p, then you increment p. That perfectly matches the order of evaluation of the -> and the ++ operators.
Edit: aww, these edit's...
So what happens when you write ++p->x is that it could be interpreted either as ++(p->x) or as (++p)->x (which one is actually chosen is just a matter of language design, K&R thought it would be a good idea to make it evaluate as in the first case). The thing is that this ambiguity doesn't exist in the case of p++->x, since it can only be interpreted as (p++)->x. The other alternatives, p(++->x), p(++->)x and p++(->x) are really just syntactically malformed "expressions".
The maximal munch strategy says that p++->x is divided into the following preprocessing tokens:
p then ++ then -> then x
In the expression p++->x there are two operators, the postfix ++ operator and the postfix -> operator. Both operators being postfix operators, they have the same precedence and there is no ambiguity in parsing the expression. p++->x is equivalent to (p++)->x.
For ++p->x expression, the situation is different.
In ++p->x, the ++ is not a postfix operator, it is the ++ unary operator. C gives postfix operators higher precedence over all unary operators and this is why ++p->x is actually equivalent to ++(p->x).
EDIT: I changed the first part of the answer as a result of Steve's comment.
Both post-increment and member access operator are postfix expressions and bind the same. Considering that they apply to the primary or postfix expression to the left, there can't be ambiguity.
In
p++->x
The postfix-++ operator can apply only to the expression to the left of it (i.e. to p).
Similarly ->x can only be an access to the expression to its left, which is p++. Writing that expression as (p++) is not needed, but also does no harm.
The "after" in your description of the effects, does not express temporal order of increment and member access. It only expresses that the result of p++ is the value p had before the increment and that that value is the value used for the member access.
The expression p++ results in a pointer with the value of p. Later on, the ++ part is performed, but for the purposes of interpreting the expression, it may just as well not be there. ->x makes the compiler add the offset for the member x to the original address in p and access that value.
If you change the statement to:
p->x; p++;
it would do exactly the same thing.
The order of precedence is actually exactly the same, as can be seen here - but it doesn't really matter.