Confusion over pointer index operator - c

I am a little bit confused about pointer index operator in C. I will try to explain my question with an example:
int array[5] = {1,2,3,4,5};
int *p;
p = array;
p[2]++;
In the fourth line, I know that it increments the element at index 2 of array. However, when I see an index operator, I convert it to pointer arithmetic.
For instance, I converted p[2]++ to *(p+2)++. According to the operator precedence table, in the statement *(p+2)++, the increment and dereferencing operators have the same precedence, but the increment is applied first due to right associativity. Therefore, it becomes *(p+3). Then, this statement cannot change any value and just points to index 3 of array.
Why does p[2]++ increment the element at index 2 of the array? What is wrong with my reasoning?

p[2]++ is equivalent to (*(p+2))++, not *(p+2)++. You need an extra set of parentheses to maintain the precedence from the original expression.
Without them you've got *(p+2)++ which, as you've noted, is equivalent to *((p+2)++). This has a different meaning from the original expression since it splits up the +2 and the *. They need to be done in the same step since [2] is a single atomic operation.

As already commented p[2]++ can be converted to (*(p+2))++, because p[2] is the element you want to increment.
Keep in mind that incrementing the index itself, rather than the element, is usually written as p[i++].

In math, if a=b+c, then a*d is not b+c*d but (b+c)*d. Likewise, p[2] must be replaced not by *(p+2) but by (*(p+2)) to avoid any change in precedence.

Related

Difference between ++*p++ and *++p++?

What is the difference between ++*p++ and *++p++ (where p is a pointer) in C?
I keep getting an error when I do the second one; can somebody explain, as it's been in my head for days. I checked on Quora and other websites but I couldn't find anything useful.
I want to know why the first one is acceptable but not the latter.
#include <stdio.h>
int main()
{
int arr[]={3,9,0,4,5};
int *ptr=arr;
printf("%d ",*++ptr++);
printf("%d ",*ptr);
return 0;
}
The issue here is a question of operator precedence and the nature of the result of the postfix increment operator (i.e. p++).
In C, that postfix increment operator has the highest priority of all; then, the prefix increment and indirection (*) operator have equal priority, and have right-to-left associativity.
So, adding parentheses to your expressions, to clarify the order of evaluation, we get the following:
*++ptr++ becomes *( ++(ptr++) )
++*ptr++ becomes ++( *(ptr++) )
Now remember that the result of the postfix operation is a so-called "rvalue" §, that is, something that can be used on the right-hand side of an assignment but not on the left-hand side. (For example, the constant 3 is an rvalue: x = 3 is a valid operation but 3 = x is clearly not.)
We can now see that, in the first of these expressions (*++ptr++), inside the outer brackets that I have added, we are trying to increment the result of the ptr++ operation, and that is not allowed. However, in the second case (++*ptr++), we are only dereferencing that result (which is a pointer) and then (outside the outer brackets) incrementing the pointed-to variable, which is allowed.
When I compile your code with clang-cl, the error is:
error : expression is not assignable
As (hopefully) explained above, the "expression" referred to is ptr++.
§ Formally, the result of the postfix increment operator is (a copy of) the value of its operand; that value is not modifiable or assignable.
This is definitely a strange and surprising result. To understand what's going on, it will help to take a closer look at what the "autoincrement" operators ++x and x++ really do.
Most expressions simply compute new values. If I say
a = b * 3;
that means, "take the value of the variable b, multiply it by 3, and that's the value we'll assign to a". Similarly, if I say
a = a + 1;
that means, "take the (old) value of the variable a, add 1 to it, and that's the new value we'll assign back to a".
But ++ is special, because it has the "assign the value back" part built in. Any time you use ++ (or --), two things are happening: we're computing a new value, but we're also modifying the variable whose value we just fetched.
To make this very clear, if I say
a = ++b;
that means, "take the value of the variable b, add 1 to it, assign that new value back to b, and that's the value we'll assign to a". That's for the "prefix" form ++b. For b++, it's a little different:
a = b++;
That means, "take the value of the variable b, add 1 to it, assign that new value back to b, but the value we'll assign to a is the old value of b, before we added 1 to it." In other words, the value of the subexpression b++, the value that "pops out" to participate in the larger expression, is the old value of b.
The other thing to keep in mind here is that when it comes to assigning values, we obviously need a variable to assign the value to. We can't say
3 = b * 3; /* WRONG */
On the right-hand side of the = sign, we fetch b's value and multiply it by 3, but then where would we store the new value? On the left-hand side of the = sign, 3 is not the name of a variable, nor is it any kind of a location where we can store a value. So an assignment like this is illegal.
(Formally, what we've been talking about here is the difference between an rvalue and an lvalue. Those are interesting and useful terms that you might want to learn about some day, perhaps even today, but I'm not going to say anything more about them for now.)
But now we have almost enough information to answer your original question. Let's look at the expression that worked:
++*ptr++
What the heck does that mean?
In one sense, it's kind of meaningless, because it's not something that you would probably ever write in a real program. It has very little practical value, which is actually kind of good, which means it's not so bad that, at first glance, it's pretty badly cryptic, in that it's not obvious what it should do.
To understand what it does, we have to be clear about the precedence. Which operators bind more tightly to their operands? Precedence is what tells us that if we write
1 + 2 * 3
the multiplication operator * binds more tightly, meaning that the expression is evaluated as if we had written
1 + (2 * 3)
Now, it happens that the postfix autoincrement operator ++ binds more tightly than the unary contents-of operator *. That is, when we write
++*ptr++
the expression is evaluated as if we had written
++ *(ptr++)
So the first thing we're going to do, inside the parentheses, is ptr++. This means, as we saw before, "take the value of the variable ptr, add 1 to it, assign that new value back to ptr, but the value that pops out to the larger expression is the old value of ptr, before we added 1 to it."
And then the next thing that happens in the "larger expression" is the * or contents-of operator. * works on a pointer, and accesses the object pointed to by the pointer. In your original program, the object pointed to by the pointer ptr was the first cell of the array arr, that is, arr[0]. So * is going to operate on whatever the old value of ptr was, whatever ptr used to point to. And that's arr[0]. So what we end up doing is the equivalent of
++(arr[0])
We're going to take arr[0]'s old value, add 1 to it, store it back in arr[0], and (since this is prefix ++ we're talking about), the value that will "pop out" to the larger expression would be the new value of arr[0]. So, if you had written
printf("%d\n", ++*ptr++);
it would have printed the new value of arr[0], or 4.
The bottom line is that although ++*ptr++ is a complicated-looking expression that's hard to understand and does something so obscure that it might not even be useful, it does do something, and is legal.
So now, finally, it's time to look at
*++ptr++
What does that do?
The first thing we have to know is whether prefix ++ or postfix ++ binds more tightly. It's a question that hardly ever comes up (and we're about to see why), but the answer is that postfix ++ binds more tightly. So this expression is interpreted as if you had written
* ++(ptr++)
So, once again, the first thing we're going to do is take ptr's value, add 1 to it, store that new value back into ptr, and then the value that's going to "pop out" to the larger expression is going to be the old value — but only the old value — of ptr.
Let me say that again. The value that "pops out" to the larger expression is just the old value of ptr. By that time we no longer know or care that it was the variable ptr that we got this value from.
So then we come to the prefix ++. And now we have a serious problem. Remember, ++ wants to fetch a value from an object, add 1 to it, and store the new value back into an object. But at this point we don't have an object to fetch from or store to, we just have a value — remember, the old value of the variable ptr.
This will be easier to understand if we think about integer variables, instead of pointer-to-integer. Suppose I said
int a;
int b = 5;
a = ++(b++);
So we fetch b's value, which is 5, and add 1 to it, and store the new value (which is 6) back in b, and the value that "pops out" to the larger expression is the old value of b. So now it's as if we had written
a = ++5; /* WRONG */
And this makes no sense. We can't "fetch the old value from the variable 5", because 5 isn't a variable. It's just as wrong as when we said 3 = b * 3;, and for the same reason.
You might also be interested in Question 4.3 in the C FAQ list.

What happens when you dereference a postincrement C

I am receiving a lot of conflicting answers about this. But as I always understood it.
When we have a pointer in C and use it in a post increment statement, the post increment will always happen after the line of code resolves.
int array[6] = {0,1,2,3,4,5};
int* p = array;
printf("%d", *p++); // This will output 0 then increment pointer to 1
output :
0
Very simple stuff. Now here's where I am receiving a bit of dissonance in the information people are telling me and my own experience.
// Same code as Before
int array[6] = {0,1,2,3,4,5};
int* p = array;
printf("%d", *(p++)); // Issue with this line
output :
0
Now when I run that second version of the code The result is that it will output 0 THEN increments the pointer. The order of operations implied by the parentheses seems to be violated. However some other answers on this site tell me that the proper thing that should happen is that the increment should happen before the dereference. So I guess my question is this: Is my understanding correct? Do post increment statements always execute at the end of the line?
Additional Info:
I am compiling with gcc on linux mint with gcc version ubuntu 4.8.4
I have also tested this on gcc on debian with version debian 4.7.2
OP's "The result is that it will output 0 THEN increments the pointer." is not correct.
The postfix increment returns the value of the pointer. Consider this value as a copy of the original pointer's value. The pointer is incremented which does not affect the copy.
The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented. ... C11dr 6.5.2.4 2
Then the copy of the pointer is de-referenced and returns the 0. That is the functional sequence of events.
Since the side effect of incrementing the pointer and the dereferencing of that copy of the pointer do not affect each other, which one occurs first is irrelevant. The compiler may optimize as it likes.
"the end of the line" is not involved in code. It is the end of the expression that is important.
There is no difference in meaning between *p++ and *(p++).
This is because postfix operators have a higher precedence than unary operators.
Both these expressions mean "p is incremented, and its previous value is dereferenced".
If you want to increment the object being referenced by the pointer, then you need to override precedence by writing (*p)++.
No version of your code can produce output and then increment p. The reason is that p is incremented in an argument expression which produces a value that is passed into printf. In C, a sequence point occurs just before a function is called. So the new value of p must settle in place before printf executes. And the output cannot take place until printf is called.
Now, you have to take the above with a slight grain of salt. Since p is a local variable, modifying it isn't an externally visible effect. If the new value of p isn't used anywhere, the increment can be entirely optimized away. But suppose we had an int * volatile p; at file scope, and used that instead. Then the expression printf("...", *p++) has to increment p before printf is called.
The expression p++ has a result (the value of p before the increment) and a side effect (the value of p is updated to point to the next object of type int).
Postfix ++ has higher precedence than unary *, so *p++ is already parsed as *(p++); you will see no difference in behavior between those two forms. IOW, the dereference operator is applied to the result of p++; the line
printf("%d", *p++);
is roughly equivalent to
printf("%d", *p);
p++;
with the caveat that p will actually be updated before the call to printf (see note 1).
However, (*p)++ will be different; instead of incrementing the pointer, you are incrementing the thing p points to.
1. The side effect of a ++ or -- operator must be applied before the next sequence point, which in this particular case occurs between the time the function arguments are evaluated and the function itself is called.
Here is my take on this. Let's ignore the printf function altogether and make things simpler.
If we said
int i;
int p=0;
i = p++;
Then i would be equal to zero because p was equal to zero but now p has been incremented by one; so now i still equals zero and p is equal to 1.
Ignoring the declarations of i and p as integers, if we wrap this as in the example, i = *(p++), then the same action occurs but i now contains the value pointed at by p which had the value of zero. However, the value of p, now, has been incremented by one.

Is this a undefined behaviour or normal output

This is a very simple question, but I still have some doubt about sequence points.
int a[3] = {1,2,4};
printf("%d",++a[1]);
o/p
3
Is this a valid c statement, I am getting output 3, which means it is same as
++(a[1])
But how is this possible as we have a pre-increment operator which has to increment the a first then the dereference has to happen.
Please correct my doubt. How we are getting 3?
Behavior is well defined. Operator [] has higher precedence than prefix ++ operator. Therefore operand a will bind to []. It will be interpreted as
printf("%d", ++(a[1]));
Your parentheses are right, your rationale for what you think should happen is obviously wrong.
If you were right, and prefix-increment had higher priority than indexing, you would get a compiler-error for ill-formed code, trying to increment an array.
As-is, there's absolutely no chance for sequencing-errors or the like leading to UB.
That's how the pre-increment operator works. It's similar to ++count. Here the value at a[1] gets incremented (as [] has higher precedence than ++) and is then printed to the console.
As you can see here: http://en.cppreference.com/w/cpp/language/operator_precedence
The operator [] has a higher precedence than prefix ++.

Confusing answers : One says *myptr++ increments pointer first,other says *p++ dereferences old pointer value

I would appreciate it if you could clarify this for me. Here are two recent questions with their accepted answers:
1) What is the difference between *myptr++ and *(myptr++) in C
2) Yet another sequence point query: how does *p++ = getchar() work?
The accepted answer for the first question, concise and easy to understand, states that since ++ has higher precedence than *, the increment to the pointer myptr is done first and then it is dereferenced. I even checked that out on the compiler and verified it.
But the accepted answer to the second question posted minutes before has left me confused.
It says in clear terms that in *p++ strictly the old address of p is dereferenced. I have little reason to question the correctness of a top-rated answer of the second question, but frankly I feel it contradicts the first question's answer by user H2CO3. So can anyone explain in plain and simple English what the second question's answer means, and how come *p++ dereferences the old value of p in the second question? Isn't p supposed to be incremented first, as ++ has higher precedence? How on earth can the older address be dereferenced in *p++? Thanks.
The postfix increment operator does have higher precedence than the dereference operator, but postfix increment on a variable returns the value of that variable prior to incrementing.
*myptr++
Thus the increment operation has higher precedence, but the dereferencing is done on the value returned by the increment, which is the previous value of myptr.
The answer in the first question you've linked to is not wrong, he's answering a different question.
There is no difference between *myptr++ and *(myptr++) because in both cases the increment is done first, and then the previous value of myptr is dereferenced.
The accepted answer for the first question, concise and easy to understand, states that since ++ has higher precedence than *,
Right. That is correct.
the increment to the pointer myptr is done first and then it is dereferenced.
It doesn't say that. Precedence determines the grouping of the subexpressions, but not the order of evaluation.
That the precedence of ++ is higher than the precedence of the indirection * says that
*myptr++
is exactly the same (not on the source code level, of course) as
*(myptr++)
and that means that the indirection is applied to the result of the
myptr++
subexpression, the old value of myptr, whereas (*myptr)++ would apply the increment operator to what myptr points to.
The result of a postfix increment is the old value of the operand, so
*myptr++ = something;
has the same effect as
*myptr = something;
myptr++;
When the side-effect of storing the incremented value of myptr happens is unspecified. It may happen before the indirection is evaluated, or after that, that is up to the compiler.
Section 6.5.2.4 of the C specification discusses the postfix increment and decrement operators. And the second paragraph there pretty much answers your question:
The result of the postfix ++ operator is the value of the operand. As a side effect, the
value of the operand object is incremented (that is, the value 1 of the appropriate type is
added to it).
...
The value computation of the result is sequenced before the side effect of
updating the stored value of the operand.
So given *myptr++, yes, it's true that the ++ part has higher precedence, but precedence alone does not determine your result; the language specification defines it. In this case the value of myptr is returned, then the "side effect" of myptr being incremented is executed.

(p++)->x Why are the parentheses unnecessary? (K&R)

From page 123 of The C Programming Language by K&R:
(p++)->x increments p after accessing x. (This last set of parentheses is unnecessary. Why?)
Why is it unnecessary considering that -> binds stronger than ++?
EDIT: Contrast the given expression with ++p->x, the latter is evaluated as ++(p->x) which would increment x, not p. So in this case parentheses are necessary and we must write (++p)->x if we want to increment p.
The only other possible interpretation is:
p++(->x)
and that doesn't mean anything. It's not even valid. The only possible way to interpret this in a valid way is (p++)->x.
Exactly because -> binds stronger than ++. (it doesn't, thanks @KerrekSB.)
increments p after accessing x.
So first you access x of p, then you increment p. That perfectly matches the order of evaluation of the -> and the ++ operators.
Edit: aww, these edit's...
So what happens when you write ++p->x is that it could be interpreted either as ++(p->x) or as (++p)->x (which one is actually chosen is just a matter of language design, K&R thought it would be a good idea to make it evaluate as in the first case). The thing is that this ambiguity doesn't exist in the case of p++->x, since it can only be interpreted as (p++)->x. The other alternatives, p(++->x), p(++->)x and p++(->x) are really just syntactically malformed "expressions".
The maximal munch strategy says that p++->x is divided into the following preprocessing tokens:
p then ++ then -> then x
In the p++->x expression there are two operators, the postfix ++ operator and the postfix -> operator. Both operators being postfix operators, they have the same precedence and there is no ambiguity in parsing the expression. p++->x is equivalent to (p++)->x.
For ++p->x expression, the situation is different.
In ++p->x, the ++ is not a postfix operator, it is the ++ unary operator. C gives postfix operators higher precedence over all unary operators and this is why ++p->x is actually equivalent to ++(p->x).
EDIT: I changed the first part of the answer as a result of Steve's comment.
Both post-increment and member access operator are postfix expressions and bind the same. Considering that they apply to the primary or postfix expression to the left, there can't be ambiguity.
In
p++->x
The postfix-++ operator can apply only to the expression to the left of it (i.e. to p).
Similarly ->x can only be an access to the expression to its left, which is p++. Writing that expression as (p++) is not needed, but also does no harm.
The "after" in your description of the effects, does not express temporal order of increment and member access. It only expresses that the result of p++ is the value p had before the increment and that that value is the value used for the member access.
The expression p++ results in a pointer with the value of p. Later on, the ++ part is performed, but for the purposes of interpreting the expression, it may just as well not be there. ->x makes the compiler add the offset for the member x to the original address in p and access that value.
If you change the statement to :
p->x; p++;
it would do exactly the same thing.
The order of precedence is actually exactly the same, as can be seen here - but it doesn't really matter.
