& operator cannot be applied to constants in C [duplicate] - c

This question already has answers here:
Using & (addressof) with const variables in C
(3 answers)
Closed 5 years ago.
According to The C programming language by Kernighan and Ritchie, page 94
& operator cannot be applied to constants
const int u = 9;
printf ("\nHello World! %u ", &u);
So, why does that work?

You are misunderstanding between constant and const. Both are different things in C language.
& operator cannot be applied to constants
In C programming, It means you cannot use &(address operator) on the literal constant to get the address of the literal constant.
For examples :
&10
or
&('a')
If you use & operator over literal constant, the compiler will give an error because constant entity does not have corresponding address.

According to the Draft Standard §6.5.3.2 ¶1, use of the address operator must obey the following constraint:
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
The meaning of lvalue is given in §6.3.2.1 ¶1:
An lvalue is an expression (with an object type other than void) that potentially designates an object.
According to §3.15 ¶1 an object is a:
region of data storage in the execution environment, the contents of which can represent values.
Now, a constant is not an lvalue; a constant has a value, but a constant does not indicate an object, and a value can't be assigned to a constant.
So, a constant is not a function designator, is not the result of a [] or unary * operator, and is not an lvalue, which means that taking the address of a constant is a constraint violation. A diagnostic message must be issued by a conforming implementation.
On the other hand, given const int u = 9;, u is a const qualified variable of type int. This declaration does reserve storage for the variable, and u is an lvalue. The use of const does not indicate that u is a constant, but that the object indicated by the identifier u is const (which is not the same thing). It may be better to think of const qualified variables as "read-only"; this is a promise made by the program that the object indicated by u will not be modified. Since u is an lvalue here (that is not a bit-field, and is not declared with the register keyword), it is fine to take its address with &u.
Do note that the posted code has undefined behavior, since the correct printf() conversion specifier for printing addresses is %p; its argument must be cast to (void *). Mismatched conversion specifiers and arguments lead to undefined behavior. So, the correct code would be:
const int u = 9;
printf ("\nHello World! %p ", (void *) &u);

No, in order to get variable's address in the memory, we can apply pointer to any of them, coz CPU allocates memory for all of them. On the other hand this question has already been asked.
The term "constant" in C indeed means only literal constants, like 2, for example. A const-qualified object is not a "constant" in C terminology
Link to question

Related

What does this statement about pointer operators mean? "& can be used only with a variable, * can be used with variable, constant or expression."

'Address of' operator gives memory location of variables. So it can be used with variables.
I tried compiling this code.
#include<stdio.h>
int main()
{
int i=889,*j,*k;
j=&889;
k=*6422296;
printf("%d\n",j);
return 0;
}
It showed this error error: lvalue required as unary '&' operand for j=&889.
And I was expecting this error: invalid type argument of unary '*' (have 'int')| for k=*6422296.
6422296 is the memory location of variable i.
Can someone give examples of when '*' is used with constants and expressions?
P.S:- I have not yet seen any need for this But....
All constants in a program are also assigned some memory. Is it possible to determine their address with &? (Just wondering).
An expression that is a value (rvalue in C idom) may not represent a variable with a defined lifetime and for that reason you cannot take its address.
In the opposite direction, it is legal (and common) to dereference an expression:
int a[] = {1,2,3}
int *pt = a + 1; // pt points to the second element of the array
inf first = *(a - 1); // perfectly legal C
Dereferencing a constant is not common in C code. It only makes sense when dealing directly with the hardware, that is in kernel mode, or when programming for some embedded systems. Then you can have special registers that are mapped at well known addresses.
first_byte_of_screen = *((char *) 0xC0000); // may remember things to old MS/DOS programmers
But best practices would recommed to define a constant
#define SCREEN ((unsigned char *) 0xC0000)
first_byte = *SCREEN; // or even SCREEN[0] because it is the same thing
k=*6422296 means go to address number 6422296, read the content inside it and assign it to k, which is completely valid.
j=&889 means get me the address of 889, 889 is an rvalue, it's a temporary that theoretically only exists temporarily in the CPU registers, and might never even get stored in the memory, so asking for it's memory address makes no sense.
All constants in a program are also assigned some memory.
That's not necessarily the case for numeric literals; they're often hardcoded into the machine code instructions with no storage allocated for them.
A good rule of thumb is that anything that can't be the target of an assignment (such as a numeric literal) cannot be the operand of the unary & operator1.
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
2 The operand of the unary * operator shall have pointer type.
C 2011 Online Draft
*6422296 "works" (as in, doesn't result in a diagnostic from the compiler) because integer expressions can be converted to pointers:
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
Like all rules of thumb, there are exceptions; array expressions are lvalues, but cannot be the target of an assignment; if you declare an array like int a[10];, then you can't reassign a such as a = some_other_array_expression ;. Such lvalues are known as non-modifiable lvalues.

Why register array names can be assigned to pointer variables without compiler error?

I have a question about register keyword in C.
I found that register array name(e.g. array) can be assigned to pointer variable while &array[0] cannot be.
Can you explain why array name can be assigned to pointer? Or, if someone already explained it, please let me know links so that I can take a look to get the answer. Thank you.
Here is what I tried:
I read cppreference which explains register keyword and it says:
register arrays are not convertible to pointers.
Also, I read C89 draft which says:
The implementation may treat any register declaration simply as an auto declaration. However, whether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register may not be computed, either explicitly (by use of the unary & operator as discussed in 3.3.3.2) or implicitly (by converting an array name to a pointer as discussed in 3.2.2.1). Thus the only operator that can be applied to an array declared with storage-class specifier register is sizeof.
This looks like I can't assign register array name to pointer to get its address.
Moreover, to find the answer, I searched here and found this question:
Address of register variable. There are good answers, however, still I couldn't find the answer what I wanted.
Here is the code which I tested. I compiled this code through Clang with flag -std=c89:
register int array[10];
int* p;
p = array; // Compiled without warning or error
p = &array[0]; // Compiled with error which I expected
I expected both p = array; and p = &array[0]; caused compiled error, but only p = &array[0]; made a compile error.
This is a bug in the Clang compiler.
GCC shows an error on both lines.
In discussing the automatic conversion of “array of type” to “pointer to type”, C 2018 6.3.2.1 says:
… If the array object has register storage class, the behavior is undefined.
Further, C 2018 footnote 124 says:
… the address of any part of an object declared with storage-class specifier register cannot be computed, either explicitly (by use of the unary & operator as discussed in 6.5.3.2) or implicitly (by converting an array name to a pointer as discussed in 6.3.2.1)…
Obviously, p = array; converts array to a pointer, as discussed in 6.3.2.1, but this footnote (which is non-normative but tells us the intent in this case quite explicitly) says the value for that pointer cannot be computed.
Since the behavior is not defined by the C standard, a C implementation could define it as an extension. However, the inconsistent behavior between array and &array[0] suggests this is not a deliberate extension by Clang but is a mistake.

Is there a reason to use C language pointers like this &(*foo)?

I use to code pointers like this when I need to change the original memory address of a pointer.
Example:
static void get_line_func(struct data_s *data,
char **begin)
{
data->slot_number = strsep(&(*(begin)), "/");
data->protocol = *begin;
strsep(&(*begin), ">");
data->service_name = strsep(&(*begin), "\n");
}
I mean, isn't &(*foo) == foo?
There is no reason to do that directly. However, the combination can arise in machine-generated code (such as the expansion of a preprocessor macro).
For instance, suppose we have a macro do_something_to(obj) which expects the argument expression obj to designate an object. Suppose somewhere in its expansion, this macro takes the address of the object using &(obj). Now suppose we would like to apply the macro to an object which we only hold via a pointer ptr. To designate the object, we must use the expression *ptr so that we use the macro as do_something_to(*ptr). That of course means that&(*ptr) now occurs in the program.
The status of the expression &*ptr has changed over the years. I seem to remember that in the ANSI C 89 / ISO C90 dialect, the expression produced undefined behavior if ptr was an invalid pointer.
In ISO C11 the following is spelled out (and I believe nearly the same text is in C99), requiring &* not to dereference the pointer: "if the operand [of the address-of unary & operator] is the result of a unary * operator,
neither that operator nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and the result is not an lvalue". Thus in the modern C dialect, the expression &*ptr doesn't dereference ptr, hence has defined behavior even if that value is null.
What does that mean? "constraints still apply" basically means that it still has to type check. Just because &*P doesn't dereference P doesn't mean that P can be a double or a struct; it has to be a pointer.
The "result is not an lvalue" part is potentially useful. If we have a pointer P which is an value, if we wrap it in the expression &*P, we obtain the same pointer value as a non-lvalue. There are other ways to obtain the value of P as a non-lvalue, but &*P is a "code golfed" solution to the problem requiring only two characters, and having the property that it will remain correct even if P changes from one pointer type to another.

Dereferencing null pointer valid and working (not sizeof) [duplicate]

Code sample:
struct name
{
int a, b;
};
int main()
{
&(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null", however C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however it doesn't say the same for -> and also it does not define a -> b as being (*a).b ; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
However I feel that the bolded text is defective and should read lvalue that potentially designates an object , as per 6.3.2.1/1 (definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined
however the & operator does evaluate its operand. (It doesn't access the stored value but that is different).
This long chain of reasoning seems to suggest that the code causes UB however it is fairly tenuous and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since you could not find a path in which there would be no UB. IMHO the root cause is that at a moment you apply the -> operator on an expression that does not point to an object.
From a compiler point of view, assuming the compiler programmer was not overcomplicated, it is clear that the expression returns the same value as offsetof(name, b) would, and I'm pretty sure that provided it is compiled without error any existing compiler will give that result.
As written, we could not blame a compiler that would note that in the inner part you use operator -> on an expression than cannot point to an object (since it is null) and issue a warning or an error.
My conclusion is that until there is a special paragraph saying that provided it is only to take its address it is legal do dereference a null pointer, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object and not defined (=undefined) otherwise. In general you shouldn't search more in the term undefined, it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to such situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is a slackness that is introduced to help compiler builders to deal with things. They may defined a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code or similar for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type "pointer to type", the result has type "type". If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is
always true that if E is a function designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of
an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined, but the question is whether the same is true for &(*E).m, where E is a null pointer and its type is a struct that has a member m?
C Standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. C Standard is correct to keep it undefined, and provides a macro offsetof that handles the problem internally.
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that an integer constant expression with the value 0 is converted to a null pointer constant.
But the value of a null pointer constant is not defined as 0. The value is implementation defined.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer will have a value where all bits are set and using member access on that value will result in an overflow which is undefined behavior
Another problem is how do you evaluate &(*E).m? Do the brackets apply and is * evaluated first. Keeping it undefined solves this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.96) If the first expression is a pointer to
a qualified type, the result has the so-qualified version of the type of the designated
member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type.
Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
Nothing in the C standard would impose any requirements on what a system could do with the expression. It would, when the standard was written, have been perfectly reasonable for it to to cause the following sequence of events at runtime:
Code loads a null pointer into the addressing unit
Code asks the addressing unit to add the offset of field b.
The addressing unit trigger a trap when attempting to add an integer to a null pointer (which should for robustness be a run-time trap, even though many systems don't catch it)
The system starts executing essentially random code after being dispatched through a trap vector that was never set because code to set it would have wasted been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as being a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard which would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0 which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel in the address 0x4. So you could use a method like this to call kernel functions.
In fact, the same approach is used on the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standard point of view, the expression is legal in C (C++ is a different beast since you can overload the & operator).

Why can this C code run correctly? [duplicate]

This question already has answers here:
Why does sizeof(x++) not increment x?
(10 answers)
Closed 7 years ago.
The C code likes this:
#include <stdio.h>
#include <unistd.h>
#define DIM(a) (sizeof(a)/sizeof(a[0]))
struct obj
{
int a[1];
};
int main()
{
struct obj *p = NULL;
printf("%d\n",DIM(p->a));
return 0;
}
This object pointer p is NULL, so, i think this p->a is illegal.
But i have tested this code in Ubuntu14.04, it can execute correctly. So, I want to know why...
Note: the original code had int a[0] above but I've changed that to int a[1] since everyone seems to be hung up on that rather than the actual question, which is:
Is the expression sizeof(p->a) valid when p is equal to NULL?
Because sizeof is a compile time construction, it does not depend on evaluating the input. sizeof(p->a) gets evaluated based on the declared type of the member p::a solely, and becomes a constant in the executable. So the fact that p points to null makes no difference.
The runtime value of p plays absolutely no role in the expression sizeof(p->a).
In C and C++, sizeof is an operator and not a function. It can be applied to either a type-id or an expression. Except in the case that of an expression and the expression is a variable-length array (new in C99) (as pointed out by paxdiablo), the expression is an unevaluated operand and the result is the same as if you had taken sizeof against the type of that expression instead. (C.f. C11 references due to paxdiablo below, C++14 working draft 5.3.3.1)
First up, if you want truly portable code, you shouldn't be attempting to create an array of size zero1, as you did in your original question, now fixed. But, since it's not really relevant to your question of whether sizeof(p->a) is valid when p == NULL, we can ignore it for now.
From C11 section 6.5.3.4 The sizeof and _Alignof operators (my bold):
2/ The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
Therefore no evaluation of the operand is done unless it's a variable length array (which your example is not). Only the type itself is used to figure out the size.
1 For the language lawyers out there, C11 states in 6.7.6.2 Array declarators (my bold):
1/ In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero.
However, since that's in the constraints section (where shall and shall not do not involve undefined behaviour), it simply means the program itself is not strictly conforming. It's still covered by the standard itself.
This code contains a constraint violation in ISO C because of:
struct obj
{
int a[0];
};
Zero-sized arrays are not permitted anywhere. Therefore the C standard does not define the behaviour of this program (although there seems to be some debate about that).
The code can only "run correctly" if your compiler implements a non-standard extension to allow zero-sized arrays.
Extensions must be documented (C11 4/8), so hopefully your compiler's documentation defines its behaviour for struct obj (a zero-sized struct?) and the value of sizeof p->a, and whether or not sizeof evaluates its operand when the operand denotes a zero-sized array.
sizeof() doesn't care a thing about the content of anything, it merely looks at the resulting type of the expression.
Since C99 and variable length arrays, it is computed at run time when a variable length array is part of the expression in the sizeof operand.Otherwise, the operand is not evaluated and the result is an integer constant
Zero-size array declarations within structs was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets(flexible array members).

Resources