How does a C compiler determine a valid lvalue?

I'm trying to figure out how C determines whether an expression is a valid lvalue.
I know that declaring a variable gives a name to a region of memory. The variable name can be used as an rvalue or as an lvalue: when it represents a value, its content is used, but when it is used as an lvalue, its address is used to say where the value of the right-hand expression should be stored. The picture I have of this operation is ADDRESS = VALUE; that is how I understand the left and right operands of the assignment operator to be evaluated.
So why can't I define a variable like int a; and then use the address-of operator to store a value at that address, like &a = 5;?
I know &a returns a constant pointer, but does that mean I can't change the address, or that I can't change the value stored at the address? If its content can't be changed, then why does *&a = 5 work?
Why can't I assign a value this way, even though, as I understand it, the left-hand expression always evaluates to an address? Maybe something is wrong in my understanding?

Automatic lvalue conversion
This is covered by C 2018 6.3.2.1 2, which says:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion.…
Consider the expression x = y + z:
y is an operand of +. The + operator is not in the list of exceptions above. So y is converted to its value.
z is an operand of +. The + operator is not in the list of exceptions above. So z is converted to its value.
x is the left operand of =, which is the assignment operator. That is in the list of exceptions above. So x remains an lvalue.
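The three cases above can be sketched in a few lines of C. The function name lvalue_conversion_demo is made up for illustration; the commented-out line is one a conforming compiler should reject.

```c
#include <assert.h>

/* Hypothetical demo: which operands of x = y + z undergo lvalue conversion. */
int lvalue_conversion_demo(void)
{
    int x = 0, y = 2, z = 3;

    /* y and z are operands of +, so each is converted to its stored value.
       x is the left operand of =, so it stays an lvalue and is written to. */
    x = y + z;

    /* (y + z) = x;   would not compile: y + z is a value, not an lvalue */

    assert(y == 2 && z == 3);   /* reading y and z did not change them */
    return x;
}
```

Calling lvalue_conversion_demo() returns 5, the value stored into x.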
About &a = 5
In regard to int a; followed by &a = 5;:
The result of the & operator is merely an address—it is just a value; there is no object holding this value, so it is not an lvalue.
The assignment operator must have an lvalue as its left operand. C 2018 6.5.16 2 is a constraint that says “An assignment operator shall have a modifiable lvalue as its left operand.”
Therefore &a = 5; violates a constraint, and a C compiler is required to produce a diagnostic message for it. The = operator cannot have a plain value as its left operand.
It is possible to design a programming language so that the assignment operator accepts &a = 5; and uses it to store the value on the right in the location given on the left. The BLISS language does this. In BLISS, the name of a variable always provides its address. To get the value, you must prefix the variable with a period (which acts like C’s unary * operator). So you would write z = .x + .y. So the fact that C does not do this is a choice about aesthetics and convenience, not about logical necessity. In C, lvalues are automatically converted to values in most places, and the exceptions are for operators that act on objects instead of values. In BLISS, you must explicitly designate each lvalue-to-value conversion.
About *&a = 5
In *&a=5:
The * operator produces an lvalue, per C 2018 6.5.3.2 4: “The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object.…”
Thus *&a provides the lvalue that the assignment operator requires.
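A minimal sketch of the difference between &a and *&a, assuming nothing beyond standard C. The function name store_through_address is hypothetical, and the commented-out assignment is the one that violates the constraint.

```c
#include <assert.h>

/* Hypothetical demo: &a is only a value, *&a is an lvalue designating a. */
int store_through_address(void)
{
    int a = 0;
    int *p = &a;   /* the address can be copied into a pointer object */

    /* &a = 5;        constraint violation: &a is not an lvalue */

    *&a = 5;       /* * turns the address back into an lvalue for a */
    assert(a == 5);

    *p += 1;       /* *p designates the same object as a */
    return a;
}
```

Here the function returns 6: the store through *&a and the update through *p both land in a.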

First of all, C does not use the term rvalue, preferring the term "value of an expression". The term lvalue is used, and it means (C11 6.3.2.1p1)
[...] an expression (with an object type other than void) that potentially designates an object
It does not mean the address of the object; the lvalue designates the object itself.
The operand of & is, more often than not, an lvalue too (C11 6.5.3.2p1):
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
The result is a value of an expression of a pointer type, an address. Even though an address points to an object, it is not the object. Just like 1600 Pennsylvania Avenue NW in Washington, D.C. is an address, but it is not the building found at that address.
So if you have a house:
house my_house;
you can ask for its address
&my_house;
which is the address of your house, but it is not a house, i.e. not an lvalue, but the house located at the address of your house is a house, i.e. an lvalue:
*&my_house;

Related

What are lvalues and rvalues? [duplicate]

I've heard the terms lvalue and rvalue come up when working with pointers.
However, I don't fully understand their meaning.
What are lvalues and rvalues?
Note 1: This is a question about C's lvalues and rvalues, not C++'s. It's also about their functionality, not their naming.
Note 2: I already fully understand these concepts. This is meant as a canonical duplicate target.
I've got a longer answer here, but basically C11 draft n1570 6.3.2.1p1:
An lvalue is an expression (with an object type other than void) that potentially designates an object [...]
C11 n1570 Footnote 64:
64) The name lvalue comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object locator value. What is sometimes called rvalue is in this International Standard described as the value of an expression. An obvious example of an lvalue is an identifier of an object. As a further example, if E is a unary expression that is a pointer to an object, *E is an lvalue that designates the object to which E points.
Not all lvalues are modifiable, i.e. not all of them can appear on the left side of an assignment. Examples of unmodifiable lvalues are those that
have array type,
have incomplete type,
are const-qualified, or
are structs or unions that have const-qualified members, either directly or recursively.
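Some of the cases listed above can be sketched as follows. The struct fixed type and the function name are invented for illustration; each commented-out assignment is one a conforming compiler should diagnose, while & and sizeof still accept the same operands because they are lvalues.

```c
#include <assert.h>
#include <stddef.h>

struct fixed { const int id; int count; };   /* hypothetical type */

/* Hypothetical demo: lvalues that cannot be assigned to. */
size_t nonmodifiable_lvalues_demo(void)
{
    int arr[4] = {0};
    const int limit = 10;
    struct fixed f = { 1, 0 };

    const int *pl = &limit;   /* limit is an lvalue: & accepts it */
    int (*pa)[4] = &arr;      /* arr is an lvalue: & accepts it */

    /* arr = arr;     error: array type */
    /* limit = 11;    error: const-qualified */
    /* f = f;         error: struct with a const-qualified member */

    f.count = *pl;            /* a non-const member is still modifiable */
    assert(f.id == 1);

    /* 4 elements in the array, plus the 10 copied into f.count */
    return sizeof *pa / sizeof arr[0] + (size_t)f.count;
}
```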
An lvalue can be converted to a value of an expression through lvalue conversion. I.e. in
int a = 0, b = 1;
a = b;
both a and b are lvalues, as they both potentially - and actually - designate objects, but b undergoes lvalue conversion on the right-hand side of the assignment, and the value of the expression b after lvalue conversion is 1.
"Potentially designating an object" means that given int *p;, *p designates an object of type int iff p points to an object of type int - but *p is an lvalue even if p == NULL or indeterminate.
According to the C Reference Manual (3rd Edition):
An lvalue is an expression that refers to an object in such a way that
the object may be examined or altered. Only an lvalue expression may
be used on the left-hand side of an assignment. An expression that is
not an lvalue is sometimes called an rvalue because it can only appear
on the right-hand side of an assignment. An lvalue can have an
incomplete array type, but not void.

Is the meaning of l-value different in c and c++?

I was told that an array name is a non-modifiable l-value in C, but I still find it confusing.
Someone said that an array name cannot be placed on the left side of an assignment because it is converted to a pointer that is not an l-value.
My questions are:
Is an array name an l-value?
Is there any difference between what l-value means in C and in C++?
Is an array name an l-value?
Yes, in both C and C++.
Is there any difference between what l-value means in C and in C++?
Yes, but not of great significance. Here is the definition from C11, paragraph 6.3.2.1/1:
An lvalue is an expression (with an object type other than void) that potentially designates an object
C also includes a footnote (#64) expanding on that, which includes:
The name ''lvalue'' comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object
''locator value''. [...] An
obvious example of an lvalue is an identifier of an object.
Here is the definition from C++14, paragraph 3.10/1:
An lvalue (so called, historically, because lvalues could appear on
the left-hand side of an assignment expression) designates a function
or an object.
If you read carefully, you will notice that in C, an lvalue only potentially designates an object, whereas in C++, no room is left for unfulfilled potential -- an lvalue does designate an object or function. You'll also then notice that C++ includes function designators among its lvalues, whereas C does not. In practice, these distinctions are more technical than deeply meaningful. And neither of them affects the answer to your question (1).
You'll also note that neither definition is written in terms of how or where an lvalue can be used. That follows from the definition and other specifications; it is not a defining characteristic.
In both C and C++, an array's identifier designates an object -- the array -- and it is therefore an lvalue. Whether such an lvalue may in fact appear as the left operand in an assignment expression is an entirely separate question.
In the context of C:
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression (with an object type other than void) that potentially
designates an object;64) if an lvalue does not designate an object when it is evaluated, the
behavior is undefined. When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that
does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including,
recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
2 Except when it is the operand of the sizeof operator, the _Alignof operator, the
unary & operator, the ++ operator, the -- operator, or the left operand of the . operator
or an assignment operator, an lvalue that does not have array type is converted to the
value stored in the designated object (and is no longer an lvalue); this is called lvalue
conversion. If the lvalue has qualified type, the value has the unqualified version of the
type of the lvalue; additionally, if the lvalue has atomic type, the value has the non-atomic
version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the
lvalue has an incomplete type and does not have array type, the behavior is undefined. If
the lvalue designates an object of automatic storage duration that could have been
declared with the register storage class (never had its address taken), and that object
is uninitialized (not declared with an initializer and no assignment to it has been
performed prior to use), the behavior is undefined.
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
64) The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left
operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an
object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described
as the ‘‘value of an expression’’.
An obvious example of an lvalue is an identifier of an object. As a further example, if E is a unary
expression that is a pointer to an object, *E is an lvalue that designates the object to which E points.
C 2011 Online Draft
Summarizing:
An array expression (that is, any expression of array type) is indeed an lvalue; however, unless it is the operand of the sizeof, _Alignof, or unary & operators, that expression gets converted ("decays") to an expression of pointer type whose value is the address of the first element of the array, and that converted pointer expression is not an lvalue, and thus cannot be the target of an assignment.
That is, if you declare a as
T a[N]; // for any type `T`
then the expression a has type "N-element array of T". If a is not the operand of the sizeof, unary &, or _Alignof operators, it will be converted to an expression of type "pointer to T", and its value will be the same as &a[0], and that value cannot be the target of an assignment (it's logically the same as writing 2 = 3 - you're trying to assign a value to a value, not an object, which doesn't work).
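As a rough sketch of the rule just summarized, with T taken to be int and N to be 5 (both choices are arbitrary, as is the function name):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: where an array expression decays and where it does not. */
int array_decay_demo(void)
{
    int a[5] = {10, 20, 30, 40, 50};

    int *first = a;            /* a decays to a pointer to a[0] */
    int (*whole)[5] = &a;      /* operand of unary &: no decay */
    size_t count = sizeof a / sizeof a[0];   /* operand of sizeof: no decay */

    /* a = first;   error: the array is not a modifiable lvalue */

    assert(*first == 10);
    return first == &a[0]
        && (void *)whole == (void *)first   /* same starting address */
        && count == 5;
}
```

The function returns 1 when all three observations hold.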

Understanding indirection through pointers and taking address

In the Standard N1570, Section 6.5.3.2#3 the following is specified (emp. mine):
If the operand is the result of a unary * operator, neither that
operator nor the & operator is evaluated and the result is as if both
were omitted, except that the constraints on the operators still apply
and the result is not an lvalue.
Later on the section 6.5.3.2#4 specifies:
If the operand points to a function, the result is a function
designator; if it points to an object, the result is an lvalue
designating the object.
These two paragraphs look contradictory to me. The first one I cited specifies that the result is not an lvalue, but the second specifies that the result of the indirection operator is an lvalue.
Can you please explain this? Does it mean that, in the case of an object, the * and & operators do not cancel each other out?
Section 6.5.3.2#3 talks about the unary & operator and the 6.5.3.2#4 talks about the unary * operator. They have different behaviors.
Elaboration (from comment):
The point is that unary & does not result in an lvalue, even in the case where it is considered omitted because it immediately precedes unary * in a dereference context. Just because both operators are considered omitted doesn't change the fact the resulting expression is not an lvalue; the same way it would not be if a solo unary & were applied.
int a;
&a = ...;
is not legal (obviously). But neither is
int *a;
&*a = ...;
Just because they are considered omitted doesn't mean &*a is lvalue-equivalent to a alone.
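A small sketch of both paragraphs at work, using hypothetical names throughout. It relies on 6.5.3.2#3 for the null-pointer case, where neither operator is evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: &*p is a plain pointer value, *&*p is an lvalue again. */
int address_of_indirection_demo(void)
{
    int a = 1;
    int *p = &a;
    int *null_p = NULL;

    int *q = &*p;        /* same value as p, but not an lvalue */
    int *r = &*null_p;   /* nothing is dereferenced, so r is simply null */

    /* &*p = &a;   error: the result of & is not an lvalue */

    *&*p = 2;            /* the outer * yields an lvalue designating a */
    assert(a == 2);

    return q == p && r == NULL;
}
```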

Does array subscripting count as taking the address of an object?

This question is inspired by answers to this question.
The following code has the potential for undefined behaviour:
uint64_t arr[1]; // Uninitialized
if(arr[0] == 0) {
The C standard specifies that an uninitialized variable with automatic storage duration has an indeterminate value, which is either an unspecified value or a trap representation. It also specifies that the uintN_t types have no padding bits, and that their size and range of values are well defined; so a trap representation is not possible for uint64_t.
So I conclude that the uninitialized value itself is not undefined behavior. What about reading it?
6.3.2.1 Lvalues, arrays, and function designators
...
Except when it is the operand of the sizeof operator, the _Alignof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue
conversion. ... -- irrelevant text removed --
... If
the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
Except when it is the operand of the sizeof operator, the _Alignof operator, or the
unary & operator, or is a string literal used to initialize an array, an expression that has
type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points
to the initial element of the array object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
Question: Does subscripting an array count as taking the address of an object?
The following text seems to imply that subscripting an array requires conversion to a pointer, which seems impossible to do without taking the address:
6.5.2.1 Array subscripting
Constraints
One of the expressions shall have type ‘‘pointer to complete object type’’, the other
expression shall have integer type, and the result has type ‘‘type’’.
Semantics
A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).
This makes §6.3.2.1 paragraph 3 seem weird. How could an array have register storage class at all, if subscripting requires conversion to a pointer?
Yes, array subscripting counts as taking the address, as per the part you quoted in 6.5.2.1. The expression E1 must have its address taken.
Therefore the special case of UB in 6.3.2.1 does not apply to array indexing. If array indices are used, it is not relevant whether the array could have been declared with the register storage class (a variable that has its address taken cannot have been declared register).
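The identity from 6.5.2.1 can be checked directly. This is only a sketch with a made-up function name, and it uses an initialized array so that nothing indeterminate is read.

```c
#include <assert.h>

/* Hypothetical demo: E1[E2] is defined as (*((E1)+(E2))). */
int subscript_identity_demo(void)
{
    int arr[3] = {10, 20, 30};

    /* arr decays to a pointer before the addition, so both sides
       name the same element. */
    int same_object = (&arr[2] == arr + 2);

    /* + is commutative, so 1[arr] is *(1 + arr), i.e. arr[1]. */
    int same_value = (arr[1] == *(arr + 1)) && (arr[1] == 1[arr]);

    assert(arr[1] == 20);
    return same_object && same_value;
}
```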
You are correct in assuming that reading an uninitialized object of a stdint.h exact-width type with indeterminate value, which has its address taken, does not invoke undefined behavior (guaranteed by C11 7.20.1.1), but merely unspecified behavior. The value could be anything and it can be non-deterministic between several reads, but it cannot be a trap.
"Reading an uninitialized variable is always UB" is a widespread but incorrect myth.
Further information with normative sources in this answer.

lvalue required as unary ‘&’ operand -- passing function result as pointer

I have some problem with my code.
There are the following functions:
static Poly PolyFromCoeff(int coeff);
static Mono MonoFromPoly(const Poly *p, int exp);
And in another function I have this line:
Mono m = MonoFromPoly(&PolyFromCoeff(10),4);
But I receive this error message:
lvalue required as unary ‘&’ operand
If I save the first result to a variable, there is no error:
Poly p = PolyFromCoeff(10);
Mono m = MonoFromPoly(&p,4);
Why is the first solution wrong?
As it says, operator & requires an lvalue as its operand, i.e. it cannot be applied to temporary values. Addresses are associated only with objects, not with values.
In the second form you instantiate an object that holds this value and you can easily take the address of that object.
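A self-contained sketch of the working form follows. The question does not show how Poly and Mono are defined, so the struct layouts and function bodies below are placeholders invented for illustration. The last call uses a C99 compound literal, which is an lvalue and can therefore be the operand of &.

```c
#include <assert.h>

/* Placeholder definitions: the real Poly and Mono are not shown in the question. */
typedef struct { int coeff; } Poly;
typedef struct { Poly p; int exp; } Mono;

static Poly PolyFromCoeff(int coeff)
{
    Poly p;
    p.coeff = coeff;
    return p;
}

static Mono MonoFromPoly(const Poly *p, int exp)
{
    Mono m;
    m.p = *p;
    m.exp = exp;
    return m;
}

/* Hypothetical demo of the forms that compile and the one that does not. */
int mono_demo(void)
{
    Poly p = PolyFromCoeff(10);      /* named object: an lvalue */
    Mono m1 = MonoFromPoly(&p, 4);

    /* Mono bad = MonoFromPoly(&PolyFromCoeff(10), 4);
       error: a function call result is not an lvalue */

    /* A compound literal is an unnamed object, so & may be applied to it. */
    Mono m2 = MonoFromPoly(&(Poly){ .coeff = 10 }, 4);

    assert(m1.exp == 4 && m2.exp == 4);
    return m1.p.coeff + m2.p.coeff;
}
```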
The C language expressly prohibits you from taking the address of an rvalue (which is what a function returns). This clause from the C11 standard (committee draft) sums it up:
6.5.3.2 Address and indirection operators
Constraints
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
If you are confused about lvalue and rvalue, think of it like this:
lvalue is something that has an identifier and storage
rvalue is a temporary result or literal value
If you have a C++ background, you might have been confused because the behavior of references is different. In C++, it's okay to have this:
static Poly PolyFromCoeff(int coeff);
static Mono MonoFromPoly(const Poly &p, int exp);
Mono m = MonoFromPoly( PolyFromCoeff(10), 4 );
