Some friends and I were discussing the correctness of the following simple code in ANSI C.
#include <stdio.h>
int main(void)
{
int a=2;
printf("%d", *&a);
return 0;
}
The discussion is about *&. If * accesses a memory location (i.e. dereferences a pointer) and & gives the memory address of some variable... I think * tries to use an int value as a memory address (which obviously doesn't work), but my friend says *& cancels out automatically (or is interpreted as &*). We tested it with GCC 4.8.1 (MinGW) and the code above worked fine... I thought that was not right.
What do you think? Is there some bad workaround at play here (or is this just stupidity)? Thanks in advance :)
a is an lvalue: the variable a.
&a is a pointer to this lvalue.
*&a is the lvalue being pointed to by &a — that is, it is a.
Technically speaking, *&a and a are not completely equivalent in all cases, in that *&a is not permitted in all circumstances where a is (for example, if a is declared as register), but in your example, they are completely the same.
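A minimal sketch of that equivalence, assuming a plain int object (the function names are made up for this illustration): *&a can be assigned to like a itself, and taking its address gives back &a.

```c
/* Writes through *&a and reads through a: both name the same object.
   The function name is hypothetical, chosen for this illustration. */
int write_through_star_amp(void)
{
    int a = 2;
    *&a = 5;      /* *&a is an lvalue, so it can be assigned to */
    return a;     /* a now holds 5 */
}

/* Returns 1 if &*&a and &a are the same address. */
int same_address(void)
{
    int a = 2;
    return &*&a == &a;
}
```

Declaring a as register int a would make both functions fail to compile, which is the one case where *&a and a differ.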
There is an interesting excerpt from C standard (as a footnote), namely C11 §6.5.3.2/4 (footnote 102, emphasis mine), which discusses this aspect directly:
Thus, &*E is equivalent to E (even if E is a null pointer), and
&(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function
designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P
is an lvalue and T is the name of an object pointer type, *(T)P is an
lvalue that has a type compatible with that to which T points.
In your case a is a (modifiable) lvalue, which corresponds to E in the standard's wording, and it is a valid operand of the & operator as the standard requires; thus *&a (i.e. *&E) is an lvalue equal to a.
Note that you can't take the address of a variable with register storage class (as pointed out by @Deduplicator), so such a variable does not qualify for this reduction (even though it is a modifiable lvalue).
So long as *&a is meaningful, then *&a and a are the same thing and are interchangeable.
In general, *&a is the same as a.
Still, there are corner-cases:
*& may be invalid because &a is not allowed, as a is not an lvalue or is of register storage class.
Using a may be Undefined Behavior, because a is an uninitialized variable that need not occupy memory (register storage class, or automatic storage class that could have been declared register because its address is never taken).
Applying that to your case, leaving out *& does not change anything.
'Address of' operator gives memory location of variables. So it can be used with variables.
I tried compiling this code.
#include<stdio.h>
int main()
{
int i=889,*j,*k;
j=&889;
k=*6422296;
printf("%d\n",j);
return 0;
}
It showed this error: lvalue required as unary '&' operand for j=&889.
And I was expecting this error: invalid type argument of unary '*' (have 'int')| for k=*6422296.
6422296 is the memory location of variable i.
Can someone give examples of when '*' is used with constants and expressions?
P.S:- I have not yet seen any need for this But....
All constants in a program are also assigned some memory. Is it possible to determine their address with &? (Just wondering).
An expression that is a value (rvalue in C idiom) may not represent a variable with a defined lifetime, and for that reason you cannot take its address.
In the opposite direction, it is legal (and common) to dereference an expression:
int a[] = {1,2,3};
int *pt = a + 1; // pt points to the second element of the array
int first = *(pt - 1); // perfectly legal C: pt - 1 points to the first element
Dereferencing a constant is not common in C code. It only makes sense when dealing directly with the hardware, that is in kernel mode, or when programming for some embedded systems. Then you can have special registers that are mapped at well known addresses.
first_byte_of_screen = *((char *) 0xC0000); // may remind old MS-DOS programmers of something
But best practice would recommend defining a constant:
#define SCREEN ((unsigned char *) 0xC0000)
first_byte = *SCREEN; // or even SCREEN[0] because it is the same thing
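k=*6422296 is meant as: go to address number 6422296, read the content stored there and assign it to k. As written it violates a constraint, because 6422296 has type int and the operand of unary * must have pointer type; with an explicit cast, as in *(int *)6422296, the dereference itself becomes valid.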
j=&889 means get me the address of 889. 889 is an rvalue: a temporary that in theory exists only briefly in the CPU registers and might never get stored in memory, so asking for its memory address makes no sense.
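If what is wanted is an addressable object that holds 889, one option (assuming C99 or later) is a compound literal. This is only a sketch with a made-up function name; the point is that (int){889} creates an unnamed object, so it is an lvalue where the bare literal 889 is not.

```c
/* Returns the value read back through a pointer to a compound literal.
   (int){889} is an unnamed int object, so unlike the bare literal 889 it
   is an lvalue and may be the operand of unary &. */
int read_via_compound_literal(void)
{
    int *j = &(int){889};   /* valid since C99; &889 is not */
    *j += 1;                /* the unnamed object is modifiable */
    return *j;              /* 890 */
}
```

The unnamed object lives until the end of the enclosing block, so j must not be used after the function returns.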
All constants in a program are also assigned some memory.
That's not necessarily the case for numeric literals; they're often hardcoded into the machine code instructions with no storage allocated for them.
A good rule of thumb is that anything that can't be the target of an assignment (such as a numeric literal) cannot be the operand of the unary & operator[1].
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
2 The operand of the unary * operator shall have pointer type.
C 2011 Online Draft
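*6422296 itself violates constraint 2 above, since its operand has type int. With an explicit cast, however, *(int *)6422296 "works" (as in, doesn't result in a diagnostic from the compiler) because integer expressions can be converted to pointers: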
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
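A sketch of that conversion in a form that is safe to run: instead of a hard-coded number, the integer is obtained from a real object's address. It assumes the implementation provides uintptr_t (an optional type in <stdint.h>), and the function name is made up. Note the explicit conversion back to a pointer; an integer is never converted implicitly when it is the operand of unary *.

```c
#include <stdint.h>

/* Round-trips the address of i through an integer and reads i back.
   Assumes uintptr_t exists on this implementation. */
int read_through_integer_address(void)
{
    int i = 889;
    uintptr_t addr = (uintptr_t)(void *)&i;   /* pointer -> integer */
    int *k = (void *)addr;                    /* integer -> pointer, explicit cast */
    return *k;                                /* reads i */
}
```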
[1] Like all rules of thumb, there are exceptions; array expressions are lvalues, but cannot be the target of an assignment; if you declare an array like int a[10];, then you can't reassign a such as a = some_other_array_expression ;. Such lvalues are known as non-modifiable lvalues.
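To illustrate the array exception, here is a small sketch (function name made up): the address of the whole array can be taken, and its elements can be assigned, but the array as a whole cannot.

```c
/* a is an lvalue, so &a is allowed, but a is not modifiable as a whole. */
int array_lvalue_demo(void)
{
    int a[10] = {0};
    int (*pa)[10] = &a;   /* fine: address of the whole array */
    (*pa)[3] = 7;         /* elements are modifiable lvalues */
    /* a = *pa; */        /* would not compile: a is a non-modifiable lvalue */
    return a[3];          /* 7 */
}
```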
Why is it necessary to cast when I dereference a void pointer?
I have this example:
int x;
void* px = &x;
*px = 9;
Can you explain why this doesn't work?
By definition, a void pointer points to an I'm-not-sure-what-type-of-object.
By definition, when you use the unary * operator to access the object pointed to by a pointer, you must know (well, the compiler must know) what the type of the object is.
So we have just proved that we cannot directly dereference a void pointer using *; we must always explicitly cast the void pointer to some actual object pointer type first.
Now, in many people's minds, the "obvious" answer to "what type does/should a 'generic' pointer point to?" is "char". And, once upon a time, before the void type had been invented, character pointers were routinely used as "generic" pointers. So some compilers (including, notably, gcc) extend things a bit and let you do more (such as pointer arithmetic) with a void pointer than the standard requires.
So that might explain how code like that in your question might be able to "work". (In your case, though, since the pointed-to type was actually int, not char, if it "worked" it was only because you were on a little-endian machine.)
...And with that said, I find that the code in your question does not work for me, not even under gcc. It first gives me a non-fatal warning:
warning: dereferencing ‘void *’ pointer
But then it changes its mind and decides this is an error instead:
error: invalid use of void expression
A second compiler I tried said something similar:
error: incomplete type 'void' is not assignable
Addendum: To say a little more about why the pointed-to type is required when you dereference a pointer:
When you access a pointer using *, the compiler is going to emit code to fetch from (or maybe store to) the pointed-to location. But the compiler is going to have to emit code that accesses a certain number of bytes, and in many cases it may matter how those byte(s) are interpreted. Both the number and the interpretation of the bytes are determined by the type (that's what types are for), which is precisely why an actual, non-void type is required.
One of the best ways I know of appreciating this requirement is to consider code like
*p + 1
or, even better
*p += 1
If p points to a char, the compiler is probably going to emit some kind of an addb ("add byte") instruction.
If p points to an int, the compiler is going to emit an ordinary add instruction.
If p points to a float or double, the compiler is going to emit a floating-point addition instruction. And so on.
But if p is a void *, the compiler has no idea what to do. It complains (in the form of an error message) not just because the C standard says you can't dereference a void pointer, but more importantly, because the compiler simply doesn't know what to do with your code.
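One way to see this in code is a helper that receives a void * together with a tag saying what it really points to. This is only a sketch; the enum and function names are invented for the illustration. Each case has to cast to a concrete type before the compiler can pick an addition to perform.

```c
/* Hypothetical tag telling add_one what p really points to. */
enum elem_kind { KIND_CHAR, KIND_INT, KIND_DOUBLE };

/* Adds 1 to the object p points to. The cast in each case supplies the
   type information that a bare void * lacks. */
void add_one(void *p, enum elem_kind kind)
{
    switch (kind) {
    case KIND_CHAR:   *(char *)p   += 1;   break;   /* byte-sized add */
    case KIND_INT:    *(int *)p    += 1;   break;   /* integer add */
    case KIND_DOUBLE: *(double *)p += 1.0; break;   /* floating-point add */
    }
}
```

Without the tag there would be no way to write the body of add_one, which is the compiler's situation when it sees *p += 1 with p of type void *.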
In short:
The target of an assignment expression must be a modifiable lvalue, which cannot be a void expression. This is because the void type does not represent any values - it denotes an absence of a value. You cannot create an object of type void.
If the expression px has type void *, then the expression *px has type void. Attempting to assign to *px is a constraint violation and the compiler is required to yell at you for it.
If you want to assign a new value to x through px, then you have to cast px to an int * before dereferencing:
*((int *)px) = 5;
Chapter and verse:
6.2.5 Types
...
19 The void type comprises an empty set of values; it is an incomplete object type that
cannot be completed.
...
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression (with an object type other than void) that potentially
designates an object;64) if an lvalue does not designate an object when it is evaluated, the
behavior is undefined. When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that
does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including,
recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
...
6.3.2.2 void
1 The (nonexistent) value of a void expression (an expression that has type void) shall not
be used in any way, and implicit or explicit conversions (except to void) shall not be
applied to such an expression. If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression is evaluated for its
side effects.)
...
6.3.2.3 Pointers
1 A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
...
6.5.3.2 Address and indirection operators
...
4 The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined.102)
...
6.5.16 Assignment operators
...
Constraints
2 An assignment operator shall have a modifiable lvalue as its left operand.
More specifically, dereferencing a void pointer violates the wording of 6.5.3.2 Address and indirection operators, paragraph 4:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ''pointer to type'', the result has type ''type''. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
Since a pointer to void has no "type" - it can't be dereferenced. Note that this is beyond undefined behavior - it is a violation of the C language standard.
It probably doesn't work because it violates a rule in the ISO C standard which requires a diagnostic, and (I'm guessing) your compiler is treating that as a fatal situation.
According to ISO C99, as well as the C11 Draft (n1548), the only constraint on the use of the * dereferencing operator is "[t]he operand of the unary * operator shall have pointer type." [6.5.3.2¶2, n1548] The code we have here meets that constraint, and has no syntax error. Therefore no diagnostic is required for the use of the * operator.
However, what is the meaning of the * operator applied to a void * pointer?
"The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’." [6.5.3.2¶4, n1548]
The type void is neither a function nor an object type, so the middle sentence, which talks about producing a function or object designator, is not applicable to our case. The last sentence quoted above is applicable; it gives a requirement that an expression which dereferences a void * has void type.
Thus *px = 9; runs aground because it's assigning an int value to a void expression. An assignment requires an lvalue expression of object type; void is not an object type and the expression is certainly not an lvalue. The exact wording of the constraint is: "An assignment operator shall have a modifiable lvalue as its left operand." [6.5.16¶2, n1548] Violation of this constraint requires a diagnostic.
It appears from my perhaps naive reading of the standard that the expression *px as such is valid; only no attempt must be made to extract a result from it, or use it as the target of an assignment. If that is true, it could be used as an expression statement whose value is discarded: if (foo()) { *px; }, and it could be redundantly cast to void also: (void) *px. These apparently pointless situations might be somehow exploited by, or at least arise in, certain kinds of macros.
For instance, if we want to be sure that the argument of some macro is a pointer we can take advantage of the constraint that * requires a pointer operand:
#define MAC(NUM, PTR) ( ... (void) *(PTR) ...)
I.e. somewhere in the macro we dereference the pointer and throw away the result, which will diagnose if PTR isn't a pointer. It looks like ISO C allows this usage even if PTR is a void *, which is arguably useful.
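A possible concrete shape for such a macro, purely as a sketch: POINTS_AT is an invented name, not something from the standard or any library. The casts on their own would silently accept an integer argument; the discarded dereference is what forces a diagnostic for non-pointers.

```c
/* Hypothetical macro: tests whether PTR points at the object OBJ.
   The (void) *(PTR) part has no useful value; it exists only so that a
   non-pointer argument violates the constraint on unary * and is diagnosed.
   PTR must be valid to dereference, since *(PTR) is evaluated. */
#define POINTS_AT(PTR, OBJ) \
    ((void) *(PTR), (const void *)(PTR) == (const void *)&(OBJ))

int points_at_demo(void)
{
    int x = 0, y = 0;
    int *p = &x;
    /* POINTS_AT(5, x) would be rejected at compile time */
    return POINTS_AT(p, x) && !POINTS_AT(p, y);
}
```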
I tend to write pointers like this when I need to change the original memory address of a pointer.
Example:
static void get_line_func(struct data_s *data,
char **begin)
{
data->slot_number = strsep(&(*(begin)), "/");
data->protocol = *begin;
strsep(&(*begin), ">");
data->service_name = strsep(&(*begin), "\n");
}
I mean, isn't &(*foo) == foo?
There is no reason to do that directly. However, the combination can arise in machine-generated code (such as the expansion of a preprocessor macro).
For instance, suppose we have a macro do_something_to(obj) which expects the argument expression obj to designate an object. Suppose somewhere in its expansion, this macro takes the address of the object using &(obj). Now suppose we would like to apply the macro to an object which we only hold via a pointer ptr. To designate the object, we must use the expression *ptr so that we use the macro as do_something_to(*ptr). That of course means that &(*ptr) now occurs in the program.
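As a sketch of that situation, with ZERO_OBJECT standing in for the hypothetical do_something_to and struct point invented for the example:

```c
#include <string.h>

struct point { int x, y; };

/* Hypothetical stand-in for do_something_to(obj): zeroes the object.
   It takes the address of its argument with &(obj). */
#define ZERO_OBJECT(obj) memset(&(obj), 0, sizeof (obj))

void reset_point(struct point *ptr)
{
    ZERO_OBJECT(*ptr);   /* expands to memset(&(*ptr), 0, sizeof (*ptr)) */
}
```

Nobody typed &(*ptr) by hand here; it appears only in the expansion.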
The status of the expression &*ptr has changed over the years. I seem to remember that in the ANSI C 89 / ISO C90 dialect, the expression produced undefined behavior if ptr was an invalid pointer.
In ISO C11 the following is spelled out (and I believe nearly the same text is in C99), requiring &* not to dereference the pointer: "if the operand [of the address-of unary & operator] is the result of a unary * operator,
neither that operator nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and the result is not an lvalue". Thus in the modern C dialect, the expression &*ptr doesn't dereference ptr, hence has defined behavior even if that value is null.
What does that mean? "constraints still apply" basically means that it still has to type check. Just because &*P doesn't dereference P doesn't mean that P can be a double or a struct; it has to be a pointer.
The "result is not an lvalue" part is potentially useful. If we have a pointer P which is an lvalue, and we wrap it in the expression &*P, we obtain the same pointer value as a non-lvalue. There are other ways to obtain the value of P as a non-lvalue, but &*P is a "code golfed" solution to the problem requiring only two characters, and having the property that it will remain correct even if P changes from one pointer type to another.
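A small sketch of both properties, assuming a C99 or later compiler (the function name is made up): &*p gives the value of p back, even when p is null, and the result cannot be assigned to.

```c
#include <stddef.h>

/* Returns 1 if &*p yields the same pointer value as p, including for null. */
int amp_star_roundtrip(void)
{
    int x = 42;
    int *p = &x;
    int *n = NULL;
    int *q = &*p;     /* same value as p */
    int *m = &*n;     /* defined since C99: neither operator is evaluated */
    /* &*p = &x; */   /* would not compile: &*p is not an lvalue */
    return q == p && m == NULL;
}
```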
Code sample:
struct name
{
int a, b;
};
int main()
{
&(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null", however C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however it doesn't say the same for -> and also it does not define a -> b as being (*a).b ; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
However I feel that the phrase "lvalue that designates an object" (bolded in my original quote) is defective and should read lvalue that potentially designates an object, as per 6.3.2.1/1 (definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined
however the & operator does evaluate its operand. (It doesn't access the stored value but that is different).
This long chain of reasoning seems to suggest that the code causes UB however it is fairly tenuous and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a language-lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since there is no reading of the standard under which it is defined. IMHO the root cause is that at some point you apply the -> operator to an expression that does not point to an object.
From a compiler point of view, assuming the compiler writer did not overcomplicate things, it is clear that the expression returns the same value as offsetof(struct name, b) would, and I'm pretty sure that, provided it compiles without error, any existing compiler will give that result.
As written, we could not blame a compiler that noticed that the inner part uses operator -> on an expression that cannot point to an object (since it is null) and issued a warning or an error.
My conclusion is that until there is a special paragraph saying that it is legal to dereference a null pointer provided it is only to take the address, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object and not defined (=undefined) otherwise. In general you shouldn't search more in the term undefined, it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to such situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is latitude that is deliberately left to help compiler builders. They may define a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code or similar for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
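For user code, the portable route is the offsetof macro from <stddef.h>, however the implementation chooses to define it. As a sketch (function names made up), its result can be compared with an offset measured on a real object, which involves no null pointer at all:

```c
#include <stddef.h>

struct name { int a, b; };

/* Offset of b as reported by the standard macro. */
size_t offset_of_b(void)
{
    return offsetof(struct name, b);
}

/* The same offset measured on a real object. */
size_t measured_offset_of_b(void)
{
    struct name obj;
    return (size_t)((char *)&obj.b - (char *)&obj);
}
```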
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type "pointer to type", the result has type "type". If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is
always true that if E is a function designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of
an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined, but the question is whether the same is true for &(*E).m, where E is a null pointer and its type is a struct that has a member m?
C Standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. C Standard is correct to keep it undefined, and provides a macro offsetof that handles the problem internally.
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that an integer constant expression with the value 0 is a null pointer constant, and converting it to a pointer type yields a null pointer.
But the representation of the resulting null pointer is not required to be all bits zero. It is up to the implementation.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer has a value with all bits set, and using member access on that value would result in an overflow, which is undefined behavior.
Another problem is how to evaluate &(*E).m. Do the brackets apply, and is * evaluated first? Keeping it undefined solves this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.96) If the first expression is a pointer to
a qualified type, the result has the so-qualified version of the type of the designated
member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type.
Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
Nothing in the C standard would impose any requirements on what a system could do with the expression. It would, when the standard was written, have been perfectly reasonable for it to cause the following sequence of events at runtime:
Code loads a null pointer into the addressing unit
Code asks the addressing unit to add the offset of field b.
The addressing unit triggers a trap when attempting to add an integer to a null pointer (which should for robustness be a run-time trap, even though many systems don't catch it)
The system starts executing essentially random code after being dispatched through a trap vector that was never set because code to set it would have been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as being a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard which would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0 which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel in the address 0x4. So you could use a method like this to call kernel functions.
In fact, the same approach is used on the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standard point of view, the expression is legal in C (C++ is a different beast since you can overload the & operator).
My function receives a void pointer, and I would like to make it point to the next location, treating the incoming pointer as a char pointer.
int doSomething( void * somePtr )
{
((char*)somePtr)++; // Gives Compilation error
}
I get the following compilation error:
Error[Pe137]: expression must be a modifiable lvalue
Is this an issue with the priority of operators?
A cast does not yield an lvalue (see section 6.5.4 footnote 104 of C11 standard), therefore you can't apply post increment ++ operator to its result.
c-faq: 4.5:
In C, a cast operator does not mean "pretend these bits have a different type, and treat them accordingly"; it is a conversion operator, and by definition it yields an rvalue, which cannot be assigned to, or incremented with ++. (It is either an accident or a deliberate but nonstandard extension if a particular compiler accepts expressions such as the above.)
Try this instead
char *charPtr = ((char*)somePtr);
charPtr++;
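Put together as a complete function, this might look like the sketch below. The name next_byte and the choice to return the advanced pointer are assumptions for the illustration; the original doSomething did not say what it does with the result.

```c
/* Returns a pointer one byte past somePtr; the caller's pointer is untouched. */
void *next_byte(void *somePtr)
{
    char *charPtr = somePtr;   /* void * converts to char * without a cast in C */
    charPtr++;                 /* charPtr is a variable, hence a modifiable lvalue */
    return charPtr;
}
```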
If you want to move the pointer to the next element, you can use:
ptr++;
If you want to copy the advanced pointer position to another variable, then:
char *abc = (char*)(def + 1);
It really depends on what you are trying to achieve.