Operator precedence in the given expressions - c

Expression 1: *p++; where p is a pointer to integer.
p will be incremented first and then the value to which it is pointing to is taken due to associativity(right to left). Is it right?
Expression 2: a=*p++; where p is a pointer to integer.
Value of p is taken first and then assigned to a first then p is incremented due to post increment. Is it right?

First of all, let me tell you that, neither associativity nor order of evaluation is actually relevant here. It is all about the operator precedence. Let's see the definitions first. (emphasis mine)
Precedence : In mathematics and computer programming, the order of operations (or operator precedence) is a collection of rules that reflect conventions about which procedures to perform first in order to evaluate a given mathematical expression.
Associativity: In programming languages, the associativity (or fixity) of an operator is a property that determines how operators of the same precedence are grouped in the absence of parentheses.
Order of evaluation : Order of evaluation of the operands of any C operator, including the order of evaluation of function arguments in a function-call expression, and the order of evaluation of the subexpressions within any expression is unspecified, except a few cases. There's mainly two types of evaluation: a) value computation b) side effect.
Post-increment has higher precedence, so it will be evaluated first.
Now, it so happens that the value increment is a side effect of the operation which is sequenced after the " value computation". So, the value computation result, will be the unchanged value of the operand p (which again, here, gets dereferenced due to use of * operator) and then, the increment takes place.
Quoting C11, chapter §6.5.2.4,
The result of the postfix ++ operator is the value of the operand. As a side effect, the
value of the operand object is incremented (that is, the value 1 of the appropriate type is
added to it). See the discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of operations on
pointers. The value computation of the result is sequenced before the side effect of
updating the stored value of the operand. [.....]
The order of evaluation in both the cases are same, the only difference is, in the first case, the final value is discarded.
If you use the first expression "as-is", your compiler should produce a warning about unused value.

Postfix operators have higher priorities than unary operators.
Thus this expression
*p++
is equivalent to the expression
*( p++ )
According to the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the
operand. As a side effect, the value of the operand object is
incremented (that is, the value 1 of the appropriate type is added to
it). See the discussions of additive operators and compound assignment
for information on constraints, types, and conversions and the effects
of operations on pointers. The value computation of the result is
sequenced before the side effect of updating the stored value of the
operand.
So p++ yields the original value of the pointer p as the result of the operation and has also a side effect of incrementing the operand itself.
As for the unary operator then (6.5.3.2 Address and indirection operators)
4 The unary * operator denotes indirection. If the operand points to a
function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If the operand
has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the
unary * operator is undefined
So the final result of the expression
*( p++ )
is the value of the object pointed to by the pointer p that also is incremented due to the side effect. This value is assigned to the variable a in the statement
a=*p++;
For example if there are the following declarations
char s[] = "Hello";
char *p = s;
char a;
then after this statement
a = *p++;
the object a will have the character 'H' and the pointer p will point to the second character of the array s that is to the character 'e'.

Associativity is not relevant here. Associativity only matters when you have adjacent operators with the same precedence. But in this case, ++ has higher precedence than *, so only precedence matters. Because of precedence, the expression is equivalent to:
*(p++)
Since it uses post-increment, p++ increments the pointer, but the expression returns the value of the pointer before it was incremented. The indirection then uses that original pointer to fetch the value. It's effectively equivalent to:
int *temp = p;
p = p + 1;
*temp;
The second expression is the same, except it assigns the value to another variable, so that last statement becomes:
a = *temp;

The expression
*p++
is equivalent to
*(p++)
This is due to precedende (i.e.: the postfix increment operator has higher precedence than the indirection operator)
and the expression
a=*p++
is for the same reason equivalent to
a=*(p++)
In both cases, the expression p++ is evaluated to p.

v = i++;: i is returned to the equality operation and then assigned to v. Subsequently, i is incremented (EDIT: technically it's not necessarily executed in this order). Thus v has the old value of i. I remember it like this: ++ is written last and therefore happens last.
v = ++i;: i is incremented, and then returned to be assigned to v. v and i has the same value.
When you don't use the returned value, they do the same (although different implementations may yield different performance in some cases). E.g. in for loops, for(int i=0; i<n; i++) is the same as for(int i=0; i<n; ++i). The latter is sometimes automatically preferred because it tends to be faster for some objects.
* has lower precedence than ++ so *p++ is the same as *(p++). Thus in this case p is returned to * which dereferences it. Then the address in p is incremented by one element. *++p increments the adress of p first, then dereferences it.
v = (*p)++; sets v equal to the old value pointed to by p and then increments it, while v = ++(*p); increments the value pointed to by p and then sets v equal to it. The address in p is unchanged.
Example: If,
int a[] = {1,2};
then
int v = *a++;
and
int v = *++a;
will both leave a incremented, but in the first case v will be 1 and in the latter it'll be 2.

*p++; where p is a pointer to integer.
p will be incremented first and then the value to which it is pointing to is taken due to associativity (right to left). Is it right?
No. In a post-increment, the value is copied to a temporary (an rvalue), then the lvalue is incremented as a side effect.
a=*p++; where p is a pointer to integer.
Value of p is taken first and then assigned to a first then p is incremented due to post increment. Is it right?
No, that's not correct either. The increment of p might happen before the write to a. What's important is that the value being stored in a was loaded using the temporary copy of the prior value of p.
Whether that memory fetch occurs before the memory write with the new value of p isn't specified, and any code that relies on the order is undefined behavior.
Any of these sequences are allowed:
Copy p into temporary THEN increment p, THEN load value at address indicated in temporary THEN store loaded value to a
Copy p into temporary THEN load value at address indicated in temporary (this value itself is placed in a temporary) THEN increment p THEN store loaded value to a
Copy p into temporary THEN load value at address indicated in temporary THEN store loaded value to a THEN increment p
Here are two code examples that are undefined behavior because they rely on the order of side effects:
int a = 7;
int *p = &a;
a = (*p)++; // undefined behavior, do not do this!!
void *pv;
pv = &pv;
void *pv2;
pv2 = *(pv++); // undefined behavior, do not do this!!!
The parentheses do not create a sequence point (or sequenced before relationship, in the new wording). The version of the code with parentheses is just as undefined as the version without.

Related

Using pointed to content in assignment of a pointer

It has always been my understanding that the lack of a sequence point after the reading of the right expression in an assignment makes an example like the following produce undefined behavior:
void f(void)
{
int *p;
/*...*/
p = (int [2]){*p};
/*...*/
}
// p is assigned the address of the first element of an array of two ints, the
// first having the value previously pointed to by p and the second, zero. The
// expressions in this compound literal need not be constant. The unnamed object
// has automatic storage duration.
However, this is EXAMPLE 2 under "6.5.2.5 Compound literals" in the committee draft for the C11 standard, the version identified as n1570, which I understand to be the final draft (I don't have access to the final version).
So, my question: Is there something in the standard that gives this defined and specified behavior?
EDIT
I would like to expound on exactly what I see as the problem, in response to some of the discussion that has come up.
We have two conditions under which an assignment is explicitly stated to have
undefined behavior, as per 6.5p2 of the standard quoted in the answer given by dbush:
1) A side effect on a scalar object is unsequenced relative to a different side
effect on the same scalar object.
2) A side effect on a scalar object is unsequenced relative to a value
computation using the value of the same scalar object.
An example of item 1 is "i = ++i + 1". In this case the side effect of
writing the value i+1 into i due to ++i is unsequenced relative to the side effect of assigning the RHS to the LHS. There is a sequence point between the value calculations of each side and the assignment of RHS to LHS, as described in 6.5.16.1 given in the answer by Jens Gustedt below. However, the modification of i due to ++i is not subject to that sequence point, otherwise the behavior would
be defined.
In the example I give above, we have a similar situation. There is a value computation, which involves the creation of an array and the conversion of that array to a pointer to its first element. There is also a side effect of writing a value to part of that array, *p to the first element.
So, I don't see what gaurantees we have in the standard that the modification
of the otherwise uninitialized first element of the array will be sequenced
before the writing of the array address to p. What about this modification (writing *p to the first element) is different from the modification of writing
i+1 to i?
To put it another way, suppose an implementation looked at the statement of interest in the example as three tasks: 1st, allocate space for the compound literal object; 2nd: assign a pointer to said space to p; 3rd: write *p to the first element in the newly allocated space. The value computation for both RHS and LHS would be sequenced before the assignment, as computing the value of the RHS only requires the address. In what way is this hypothetical implementation not standard compliant?
You need to look at the definition of the assignment operator in 6.5.16.1
The side effect of updating the stored value of the left operand is
sequenced after the value computations of the left and right operands.
The evaluations of the operands are unsequenced.
So here you clearly see that first it evaluates the expressions on both sides in any order or even concurrently, and then stores the value of the right into the object designated by the left.
Additionally, you should know that LHS and RHS of an assignment are evaluated differently. Citations are a bit too long, so here is a summary
For the LHS the evaluation leaves "lvalues", that is objects such as
p, untouched. In particular it doesn't look at the contents of the
object.
For the RHS there is "lvalue conversion", that is for any object that is found there (e.g *p) the contents of that object is loaded.
If the RHS contains an lvalue of array type, this array is converted to a pointer to its first element. This is what is happening to your compound literal.
Edit: You added another question
What about this modification (writing *p to the first element) is
different from the modification of writing i+1 to i?
The difference is simply that i in the LHS of the assignment and thus has to be updated. The array from the compound literal is not in the LHS and thus is of no concern for the update.
Section 6.5p2 of the C standard details why this is valid:
If a side effect on a scalar object is unsequenced relative to either
a different side effect on the same scalar object or a value
computation using the value of the same scalar object, the behavior is
undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an
unsequenced side effect occurs in any of the orderings. 84)
And footnote 84 states:
84) This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
while allowing
i = i + 1;
a[i] = i;
The posted snippet from 6.5.2.5 falls under the latter, as there is no side effect.
In (int [2]){*p}, *p provides an initial value for the compound literal. This is not an assignment, and it is not a side effect. The initial value is part of the object when the object is created. There is no moment when the array exists and it is not initialized.
In p = (int [2]){*p}, we know the side effect of updating p is sequenced after the computation of the right side because C 2011 [N1570] 6.5.16 3 says “The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands.”

Why can't you increment/decrement a variable twice in the same expression?

When I try to compile this code
int main() {
int i = 0;
++(++i);
}
I get this error message.
test.c:3:5: error: lvalue required as increment operand
++(++i);
^
What is the error message saying? Is this something that gets picked up by the parser, or is it only discovered during semantic analysis?
++i will give an rvalue1 after the evaluation and you can't apply ++ on an rvalue.
§6.5.3.1 (p1):
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
1. What is sometimes called "rvalue" is in this International Standard described as the "value of an expression". - §6.3.2.1 footnote 64).
A lvalue is a value you can write to / assign to.
You can apply ++ to i (i is modified) but you cannot apply ++ to the result of the previous ++ operator. I wouldn't have any effect anyway.
Aside: C++ allows that (probably because ++ operator returns a non-const reference on the modified value)
The issue that the (++i) returns new integer value, and please note ++ operation needs some variable for assignment, not a value (you are trying to increment an integer not a variable), so you can use this instead :
i += 2;
or
i = i + 2;

Postfix increment and derefencing of pointer variable in the same statement - what is executed first and what is executed next [duplicate]

I am learning C language and quite confused the differences between ++*ptr and *ptr++.
For example:
int x = 19;
int *ptr = &x;
I know ++*ptr and *ptr++ produce different results but I am not sure why is that?
These statements produce different results because of the way in which the operators bind. In particular, the prefix ++ operator has the same precedence as *, and they associate right-to-left. Thus
++*ptr
is parsed as
++(*ptr)
meaning "increment the value pointed at by ptr,". On the other hand, the postfix ++ operator has higher precedence than the dereferrence operator *. Thefore
*ptr++
means
*(ptr++)
which means "increment ptr to go to the element after the one it points at, then dereference its old value" (since postfix ++ hands back the value the pointer used to have).
In the context you described, you probably want to write ++*ptr, which would increment x indirectly through ptr. Writing *ptr++ would be dangerous because it would march ptr forward past x, and since x isn't part of an array the pointer would be dangling somewhere in memory (perhaps on top of itself!)
Hope this helps!
The accepted answer is not correct. It's not the case that the postfix ++ operator has the same precedence as dereference/indirection *. The prefix and postfix operators have different precedence, and only the prefix operator has the same precedence as dereference/indirection.
As the precedence table shows, postfix ++ has a higher precedence than dereference/indirection *. So *ptr++ gets evaluated as *(ptr++). ptr++ evaluates to the current value of ptr; it increments ptr only as a side effect. The value of the expression is the same as the current value of ptr. So it won't have any effect on the value stored at the pointer. It will merely dereference the pointer (i.e., get the current value stored there, which is 19), then advance the pointer. In your example there is no defined value stored at the new position of ptr, so the pointer is pointing to garbage. Dereferencing it now would be dangerous.
Also as the table shows, prefix ++ has the same precedence as dereference/indirection *, but because of right-left associativity, it gets evaluated as ++(*ptr). This will first dereference the pointer (i.e., get the value stored at the address pointed to) and then increment that value. I.e., the value will now be 20.
The accepted answer is correct about the effects of the two, but the actual mechanism is different from the one given there.
As templatetypedef says, but you should provide the parenthesis around *ptr to ensure the outcome. For instance, the following yields 1606415888 using GCC and 0 using CLang on my computer:
int x = 19;
int *ptr = &x;
printf("%d\n", *ptr++);
printf("%d\n", *ptr);
And you expected x to be 20. So use (*ptr)++ instead.

prefix or postfix operation on c pointer

#include<stdio.h>
void increment(int *p) {
*p = *p + 1;
}
void main() {
int a = 1;
increment(&a);
printf("%d", a);
}
for above if I run above code it prints 2
but if I replace *p = *p + 1; with *p++;
it is printing 1.Why is it so?...
operator precedence...
When writing *p++ you get these operations:
p++ is evaluated (later p will get incremented)
the original value of p is returned (since this is a suffix ++, if it was prefix, the value returned would have been p+1... )
dereference of the the pointer p and since p earlier pointed to 1, that's what you get
If you take a look at the precedence table for operators you will see that the postfix increment has a higher precedence than dereference. This means that *p++ will actually be grouped as *(p++).
You should use parenthesis to explicit what you are trying to do, in this case (*p)++ or ++(*p).
To answer this question consult the table of C operator precedence: operator + has lower precedence than dereference operator *, while increment operator ++ has higher precedence.
That is why ++ is applied to the pointer, while + 1 is applied to the result of pointer dereference.
Read about operator precedence. Check what happens when you do (*p)++.
Note: You may also try doing *(p++). But this will invoke undefined behaviour UB.
In the first case *p=*p+1; the control adds one to the value at *p by 1 and then stores the result in *p. This actually uses a temporary variable during program runtime which you are not able to see. Now,the value at the temporary instance is incremented and finally stored in *p.
In the second case, *p++; a sequence point concept comes.
According to C standard an object's stored value can be modified only once (by evaluation of expressions) between two sequence points.
A sequence point occurs:
at the end of full expressions
at the &&,|| and ?: operators.
at a function call (after evaluation of arguments,just before the actual call)
In *p++;, since the expression is modified only after the sequence point is encountered, *p is unable to store the modified value of itself.

Pointer Arithmetic: ++*ptr or *ptr++?

I am learning C language and quite confused the differences between ++*ptr and *ptr++.
For example:
int x = 19;
int *ptr = &x;
I know ++*ptr and *ptr++ produce different results but I am not sure why is that?
These statements produce different results because of the way in which the operators bind. In particular, the prefix ++ operator has the same precedence as *, and they associate right-to-left. Thus
++*ptr
is parsed as
++(*ptr)
meaning "increment the value pointed at by ptr,". On the other hand, the postfix ++ operator has higher precedence than the dereferrence operator *. Thefore
*ptr++
means
*(ptr++)
which means "increment ptr to go to the element after the one it points at, then dereference its old value" (since postfix ++ hands back the value the pointer used to have).
In the context you described, you probably want to write ++*ptr, which would increment x indirectly through ptr. Writing *ptr++ would be dangerous because it would march ptr forward past x, and since x isn't part of an array the pointer would be dangling somewhere in memory (perhaps on top of itself!)
Hope this helps!
The accepted answer is not correct. It's not the case that the postfix ++ operator has the same precedence as dereference/indirection *. The prefix and postfix operators have different precedence, and only the prefix operator has the same precedence as dereference/indirection.
As the precedence table shows, postfix ++ has a higher precedence than dereference/indirection *. So *ptr++ gets evaluated as *(ptr++). ptr++ evaluates to the current value of ptr; it increments ptr only as a side effect. The value of the expression is the same as the current value of ptr. So it won't have any effect on the value stored at the pointer. It will merely dereference the pointer (i.e., get the current value stored there, which is 19), then advance the pointer. In your example there is no defined value stored at the new position of ptr, so the pointer is pointing to garbage. Dereferencing it now would be dangerous.
Also as the table shows, prefix ++ has the same precedence as dereference/indirection *, but because of right-left associativity, it gets evaluated as ++(*ptr). This will first dereference the pointer (i.e., get the value stored at the address pointed to) and then increment that value. I.e., the value will now be 20.
The accepted answer is correct about the effects of the two, but the actual mechanism is different from the one given there.
As templatetypedef says, but you should provide the parenthesis around *ptr to ensure the outcome. For instance, the following yields 1606415888 using GCC and 0 using CLang on my computer:
int x = 19;
int *ptr = &x;
printf("%d\n", *ptr++);
printf("%d\n", *ptr);
And you expected x to be 20. So use (*ptr)++ instead.

Resources