Understanding indirection through pointers and taking address - c

In the Standard N1570, Section 6.5.3.2#3 the following is specified (emp. mine):
If the operand is the result of a unary * operator, neither that
operator nor the & operator is evaluated and the result is as if both
were omitted, except that the constraints on the operators still apply
and the result is not an lvalue.
Later, section 6.5.3.2#4 specifies:
If the operand points to a function, the result is a function
designator; if it points to an object, the result is an lvalue
designating the object.
These two sections look contradictory to me. The first one I cited specifies that the result is not an lvalue, but the second one specifies that the result of the indirection operator is an lvalue.
Can you please explain this? Does it mean that, in the case of an object, the operators * and & do not eliminate each other?

Section 6.5.3.2#3 talks about the unary & operator and the 6.5.3.2#4 talks about the unary * operator. They have different behaviors.
Elaboration (from comment):
The point is that unary & does not result in an lvalue, even in the case where it is considered omitted because it immediately precedes unary * in a dereference context. Just because both operators are considered omitted doesn't change the fact the resulting expression is not an lvalue; the same way it would not be if a solo unary & were applied.
int a;
&a = ...;
is not legal (obviously). But neither is
int *p;
&*p = ...;
Just because they are considered omitted doesn't mean &*p is lvalue-equivalent to a solo p.

Related

Why the unary * operator does not have a constraint "the operand shall not be a pointer to void"?

C2x, 6.5.3.2 Address and indirection operators, Constraints, 2:
The operand of the unary * operator shall have pointer type.
Why is there no constraint "the operand shall not be a pointer to void"?
Though it can be deduced from:
C2x, 6.5.3.2 Address and indirection operators, Semantics, 4:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object.
C2x, 6.3.2.1 Lvalues, arrays, and function designators, 1:
An lvalue is an expression (with an object type other than void) that potentially designates an object; ...
One possible (though somewhat contrived, I'll admit) case where adding your 'suggested' constraint would break code is where the & and * operators are concatenated. In such cases, an expression such as a = &*p, where p is a void* type, is allowed.
From this Draft Standard, immediately following the section in your first citation (bold emphasis mine):
Semantics
3     The unary & operator yields the address of its
operand. If the operand has type ‘‘type’’, the result has type
‘‘pointer to type’’. If the operand is the result of a unary *
operator, neither that operator nor the & operator is evaluated
and the result is as if both were omitted, except that the
constraints on the operators still apply and the result is not an
lvalue. …
I can't, currently, think of a use-case for that &* combination (on a void* or any other pointer type) – but it may occur in code that is "auto-generated" and/or uses conditional macro expansion(s).

How does C compiler determine a valid lvalue?

I'm trying to figure out how C determines if an expression is a valid LVALUE.
I know declaring variable gives it a named memory space, which is variable name. The variable name can be RVALUE or LVALUE. If used to represent a value its content is used, but if it is used as LVALUE its address is used to tell that the expression at right side is stored in this address. The picture I see for this operation is like ADDRESS=VALUE: That's how the right and left expressions for assignment operator are evaluated.
So why I can't define a variable like int a;, and then use the address of operator to store value in that address, like &a = 5;?
I know &a returns a constant pointer, but does that mean I can't change the address, or that I can't change the value stored at the address? If its content can't be changed, then why does *&a=5 work?
Why can't I assign a value this way, although the left-hand expression always evaluates to an address, as I understand it? Maybe something is wrong in my understanding?
Automatic lvalue conversion
This is covered by C 2018 6.3.2.1 2, which says:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion.…
Consider the expression x = y + z:
y is an operand of +. The + operator is not in the list of exceptions above. So y is converted to its value.
z is an operand of +. The + operator is not in the list of exceptions above. So z is converted to its value.
x is the left operand of =, which is the assignment operator. That is in the list of exceptions above. So x remains an lvalue.
About &a = 5
In regard to int a; followed by &a = 5;:
The result of the & operator is merely an address—it is just a value; there is no object holding this value, so it is not an lvalue.
The assignment operator must have an lvalue as its left operand. C 2018 6.5.16 2 is a constraint that says “An assignment operator shall have a modifiable lvalue as its left operand.”
Therefore &a = 5; violates a constraint, and a C compiler is required to produce a diagnostic message for it. The = operator cannot have a plain value as its left operand.
It is possible to design a programming language so that the assignment operator accepts &a = 5; and uses it to store the value on the right in the location given on the left. The BLISS language does this. In BLISS, the name of a variable always provides its address. To get the value, you must prefix the variable with a period (which acts like C’s unary * operator). So you would write z = .x + .y. So the fact that C does not do this is a choice about aesthetics and convenience, not about logical necessity. In C, lvalues are automatically converted to values in most places, and the exceptions are for operators that act on objects instead of values. In BLISS, you must explicitly designate each lvalue-to-value conversion.
About *&a = 5
In *&a=5:
The * operator produces an lvalue, per C 2018 6.5.3.2 4: “The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object.…”
Thus *&a provides the lvalue that the assignment operator requires.
First of all, C does not use the term rvalue, preferring the term "value of an expression". The term lvalue is used, and it means (C11 6.3.2.1p1)
[...] an expression (with an object type other than void) that potentially designates an object
It does not mean the address of the object; an lvalue designates the object itself.
The operand of & is, more often than not, an lvalue too:
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
The result is a value of an expression of a pointer type, an address. Even though an address points to an object, it is not the object. Just like 1600 Pennsylvania Avenue NW in Washington, D.C. is an address, but it is not the building found at that address.
So if you have a house:
house my_house;
you can ask for its address
&my_house;
which is the address of your house, but it is not a house, i.e. not an lvalue, but the house located at the address of your house is a house, i.e. an lvalue:
*&my_house;

Is (*&a) a lvalue or a rvalue?

At first I thought it was an rvalue, but the following fact changed my mind.
I tried the expression &(*&a) and it works fine, but the & operator only works with an lvalue, so (*&a) must be an lvalue. Why?
Per C 2018 6.5.3.2 4 (discussing the unary * operator), the result of unary * is an lvalue:
… If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object.…
This tells us that *&a is an lvalue. However, the expression asked about in the question is (*&a), so we must consider the effect of the parentheses.
6.3.2.1 2 (discussing automatic conversions) seems to tell us that (*&a) is converted to the value in *&a and is not an lvalue:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion.
However, 6.5.1 5 (discussing parenthesized expressions) contradicts this:
A parenthesized expression is a primary expression. Its type and value are identical to those of the unparenthesized expression. It is an lvalue, a function designator, or a void expression if the unparenthesized expression is, respectively, an lvalue, a function designator, or a void expression.
This is a defect in the C standard; 6.5.1 5 and 6.3.2.1 2 contradict each other. It is left to us to understand that 6.5.1 5, which is specifically about parenthesized expressions, takes precedence over the more general 6.3.2.1 2, and this is how all C implementations behave.
Thus (*&a) is an lvalue.
This expression &(&a) is invalid and will not work.
According to the C Standard:
1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
that designates an object that is not a bit-field and is not declared
with the register storage-class specifier.
and
3 The unary & operator yields the address of its operand. If the
operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. If
the operand is the result of a unary * operator, neither that operator
nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply
and the result is not an lvalue.
So the result of the expression &a is not an lvalue, and you may not apply the & operator to it, as in &(&a).
Here is a demonstrative program.
#include <stdio.h>
int main(void)
{
    int x = 10;
    &( &x );
    return 0;
}
The compiler gcc 8.3 issues an error
prog.c: In function ‘main’:
prog.c:7:2: error: lvalue required as unary ‘&’ operand
&( &x );
^
This expression *&a is valid and the result is an lvalue.
4 The unary * operator denotes indirection. If the operand points to a
function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If the
operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If
an invalid value has been assigned to the pointer, the behavior of the
unary * operator is undefined.
Bear in mind that parentheses do not influence whether the enclosed expression is an lvalue or not.

Why can't you increment/decrement a variable twice in the same expression?

When I try to compile this code
int main() {
    int i = 0;
    ++(++i);
}
I get this error message.
test.c:3:5: error: lvalue required as increment operand
++(++i);
^
What is the error message saying? Is this something that gets picked up by the parser, or is it only discovered during semantic analysis?
++i yields an rvalue [1] after evaluation, and you can't apply ++ to an rvalue.
§6.5.3.1 (p1):
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
[1] "What is sometimes called 'rvalue' is in this International Standard described as the 'value of an expression'." (§6.3.2.1, footnote 64)
An lvalue designates an object that you can write to / assign to.
You can apply ++ to i (i is modified) but you cannot apply ++ to the result of the previous ++ operator. It wouldn't have any effect anyway.
Aside: C++ allows that (probably because ++ operator returns a non-const reference on the modified value)
The issue is that (++i) yields a new integer value, while the ++ operation needs a variable to assign to, not a value (you are trying to increment an integer value, not a variable). You can use this instead:
i += 2;
or
i = i + 2;

order of evaluation for multiple increment operator on pointer

I'm having trouble understanding how the following expressions would be evaluated:
++*++ptr and *ptr++++
As I understand it, the first should give me "lvalue required", because after * is applied it yields a value, which cannot be used with the ++ operator. But the result is the opposite. Please explain.
Second statement gives me error : batch3.c:6:21: error: lvalue required as increment operand
printf("%d", *ptr++++);
First, some standardese:
6.5.2.4 Postfix increment and decrement operators
Constraints
1 The operand of the postfix increment or decrement operator shall have atomic, qualified,
or unqualified real or pointer type, and shall be a modifiable lvalue.
Semantics
2 The result of the postfix ++ operator is the value of the operand. As a side effect, the
value of the operand object is incremented (that is, the value 1 of the appropriate type is
added to it). See the discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of operations on
pointers. The value computation of the result is sequenced before the side effect of
updating the stored value of the operand. With respect to an indeterminately-sequenced
function call, the operation of postfix ++ is a single evaluation. Postfix ++ on an object
with atomic type is a read-modify-write operation with memory_order_seq_cst
memory order semantics.98)
...
6.5.16 Assignment operators
...
3 An assignment operator stores a value in the object designated by the left operand. An
assignment expression has the value of the left operand after the assignment,111) but is not
an lvalue. The type of an assignment expression is the type the left operand would have
after lvalue conversion. The side effect of updating the stored value of the left operand is
sequenced after the value computations of the left and right operands. The evaluations of
the operands are unsequenced.
Emphasis mine.
The upshot of that wall of text is that the results of the expressions ptr++ and ++ptr are not lvalues. However, both expressions result in pointer values, so they may be the operands of the unary * operator, and the results of *ptr++ and *++ptr may be lvalues.
This is why ++*++ptr works; you're incrementing the result of *++ptr, which may be an lvalue. However, *ptr++++ is parsed as *(((ptr)++)++) (postfix ++ has higher precedence than unary *); the result of ptr++ is the operand to the second ++, but since the result of ptr++ is not an lvalue, the compiler complains. If you had written it as (*ptr++)++, then the expression would be valid.
In short:
++*++ptr - valid, equivalent to ++(*(++ptr))
*++++ptr - invalid, equivalent to *(++(++ptr)), result of ++ptr is not an lvalue
++++*ptr - invalid, equivalent to ++(++(*ptr)), result of ++*ptr is not an lvalue
*ptr++++ - invalid, equivalent to *((ptr++)++), result of ptr++ is not an lvalue
(*ptr)++++ - invalid, equivalent to ((*ptr)++)++, result of (*ptr)++ is not an lvalue
(*ptr++)++ - valid
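Postfix ++ has higher precedence than unary *.
So in *p++ the ++ applies to the pointer p, not to the object it points to.
*p++
First
p++
is evaluated: it yields the old value of p and increments p as a side effect. Then * dereferences that old value, so *p++ means *(p++).
++ needs a modifiable lvalue as its operand, that is, something designating an object to increment. The expression below doesn't provide one for the second ++, because p++ is a value, not a modifiable lvalue.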
*ptr++++;
Considering that these two expressions are separate (note that and is defined in <iso646.h>), here is what happens:
The first one, ++*++ptr, is equivalent to ++(*(++ptr)), as prefix ++ and unary * have the same precedence and both associate from right to left. See the following example as an illustration:
#include <stdio.h>
int main(void)
{
    int a[] = {1, 2};
    int *ptr = a;
    ++(*(++ptr));
    printf("%d\n", a[0]);
    printf("%d\n", a[1]);
    return 0;
}
Result:
1
3
The latter expression does not compile, as the ptr++ subexpression is not a modifiable lvalue. Note that postfix ++ has higher precedence than * (the indirection operator), and its associativity is from left to right.

Resources