I use to code pointers like this when I need to change the original memory address of a pointer.
Example:
static void get_line_func(struct data_s *data,
char **begin)
{
data->slot_number = strsep(&(*(begin)), "/");
data->protocol = *begin;
strsep(&(*begin), ">");
data->service_name = strsep(&(*begin), "\n");
}
I mean, isn't &(*foo) == foo?
There is no reason to do that directly. However, the combination can arise in machine-generated code (such as the expansion of a preprocessor macro).
For instance, suppose we have a macro do_something_to(obj) which expects the argument expression obj to designate an object. Suppose somewhere in its expansion, this macro takes the address of the object using &(obj). Now suppose we would like to apply the macro to an object which we only hold via a pointer ptr. To designate the object, we must use the expression *ptr so that we use the macro as do_something_to(*ptr). That of course means that&(*ptr) now occurs in the program.
The status of the expression &*ptr has changed over the years. I seem to remember that in the ANSI C 89 / ISO C90 dialect, the expression produced undefined behavior if ptr was an invalid pointer.
In ISO C11 the following is spelled out (and I believe nearly the same text is in C99), requiring &* not to dereference the pointer: "if the operand [of the address-of unary & operator] is the result of a unary * operator,
neither that operator nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and the result is not an lvalue". Thus in the modern C dialect, the expression &*ptr doesn't dereference ptr, hence has defined behavior even if that value is null.
What does that mean? "constraints still apply" basically means that it still has to type check. Just because &*P doesn't dereference P doesn't mean that P can be a double or a struct; it has to be a pointer.
The "result is not an lvalue" part is potentially useful. If we have a pointer P which is an value, if we wrap it in the expression &*P, we obtain the same pointer value as a non-lvalue. There are other ways to obtain the value of P as a non-lvalue, but &*P is a "code golfed" solution to the problem requiring only two characters, and having the property that it will remain correct even if P changes from one pointer type to another.
Related
'Address of' operator gives memory location of variables. So it can be used with variables.
I tried compiling this code.
#include<stdio.h>
int main()
{
int i=889,*j,*k;
j=&889;
k=*6422296;
printf("%d\n",j);
return 0;
}
It showed this error error: lvalue required as unary '&' operand for j=&889.
And I was expecting this error: invalid type argument of unary '*' (have 'int')| for k=*6422296.
6422296 is the memory location of variable i.
Can someone give examples of when '*' is used with constants and expressions?
P.S:- I have not yet seen any need for this But....
All constants in a program are also assigned some memory. Is it possible to determine their address with &? (Just wondering).
An expression that is a value (rvalue in C idom) may not represent a variable with a defined lifetime and for that reason you cannot take its address.
In the opposite direction, it is legal (and common) to dereference an expression:
int a[] = {1,2,3}
int *pt = a + 1; // pt points to the second element of the array
inf first = *(a - 1); // perfectly legal C
Dereferencing a constant is not common in C code. It only makes sense when dealing directly with the hardware, that is in kernel mode, or when programming for some embedded systems. Then you can have special registers that are mapped at well known addresses.
first_byte_of_screen = *((char *) 0xC0000); // may remember things to old MS/DOS programmers
But best practices would recommed to define a constant
#define SCREEN ((unsigned char *) 0xC0000)
first_byte = *SCREEN; // or even SCREEN[0] because it is the same thing
k=*6422296 means go to address number 6422296, read the content inside it and assign it to k, which is completely valid.
j=&889 means get me the address of 889, 889 is an rvalue, it's a temporary that theoretically only exists temporarily in the CPU registers, and might never even get stored in the memory, so asking for it's memory address makes no sense.
All constants in a program are also assigned some memory.
That's not necessarily the case for numeric literals; they're often hardcoded into the machine code instructions with no storage allocated for them.
A good rule of thumb is that anything that can't be the target of an assignment (such as a numeric literal) cannot be the operand of the unary & operator1.
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
2 The operand of the unary * operator shall have pointer type.
C 2011 Online Draft
*6422296 "works" (as in, doesn't result in a diagnostic from the compiler) because integer expressions can be converted to pointers:
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
Like all rules of thumb, there are exceptions; array expressions are lvalues, but cannot be the target of an assignment; if you declare an array like int a[10];, then you can't reassign a such as a = some_other_array_expression ;. Such lvalues are known as non-modifiable lvalues.
Code sample:
struct name
{
int a, b;
};
int main()
{
&(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null", however C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however it doesn't say the same for -> and also it does not define a -> b as being (*a).b ; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
However I feel that the bolded text is defective and should read lvalue that potentially designates an object , as per 6.3.2.1/1 (definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined
however the & operator does evaluate its operand. (It doesn't access the stored value but that is different).
This long chain of reasoning seems to suggest that the code causes UB however it is fairly tenuous and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since you could not find a path in which there would be no UB. IMHO the root cause is that at a moment you apply the -> operator on an expression that does not point to an object.
From a compiler point of view, assuming the compiler programmer was not overcomplicated, it is clear that the expression returns the same value as offsetof(name, b) would, and I'm pretty sure that provided it is compiled without error any existing compiler will give that result.
As written, we could not blame a compiler that would note that in the inner part you use operator -> on an expression than cannot point to an object (since it is null) and issue a warning or an error.
My conclusion is that until there is a special paragraph saying that provided it is only to take its address it is legal do dereference a null pointer, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object and not defined (=undefined) otherwise. In general you shouldn't search more in the term undefined, it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to such situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is a slackness that is introduced to help compiler builders to deal with things. They may defined a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code or similar for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type "pointer to type", the result has type "type". If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is
always true that if E is a function designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of
an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined, but the question is whether the same is true for &(*E).m, where E is a null pointer and its type is a struct that has a member m?
C Standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. C Standard is correct to keep it undefined, and provides a macro offsetof that handles the problem internally.
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that an integer constant expression with the value 0 is converted to a null pointer constant.
But the value of a null pointer constant is not defined as 0. The value is implementation defined.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer will have a value where all bits are set and using member access on that value will result in an overflow which is undefined behavior
Another problem is how do you evaluate &(*E).m? Do the brackets apply and is * evaluated first. Keeping it undefined solves this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.96) If the first expression is a pointer to
a qualified type, the result has the so-qualified version of the type of the designated
member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type.
Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
Nothing in the C standard would impose any requirements on what a system could do with the expression. It would, when the standard was written, have been perfectly reasonable for it to to cause the following sequence of events at runtime:
Code loads a null pointer into the addressing unit
Code asks the addressing unit to add the offset of field b.
The addressing unit trigger a trap when attempting to add an integer to a null pointer (which should for robustness be a run-time trap, even though many systems don't catch it)
The system starts executing essentially random code after being dispatched through a trap vector that was never set because code to set it would have wasted been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as being a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard which would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0 which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel in the address 0x4. So you could use a method like this to call kernel functions.
In fact, the same approach is used on the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standard point of view, the expression is legal in C (C++ is a different beast since you can overload the & operator).
This question already has answers here:
Why does sizeof(x++) not increment x?
(10 answers)
Closed 7 years ago.
The C code likes this:
#include <stdio.h>
#include <unistd.h>
#define DIM(a) (sizeof(a)/sizeof(a[0]))
struct obj
{
int a[1];
};
int main()
{
struct obj *p = NULL;
printf("%d\n",DIM(p->a));
return 0;
}
This object pointer p is NULL, so, i think this p->a is illegal.
But i have tested this code in Ubuntu14.04, it can execute correctly. So, I want to know why...
Note: the original code had int a[0] above but I've changed that to int a[1] since everyone seems to be hung up on that rather than the actual question, which is:
Is the expression sizeof(p->a) valid when p is equal to NULL?
Because sizeof is a compile time construction, it does not depend on evaluating the input. sizeof(p->a) gets evaluated based on the declared type of the member p::a solely, and becomes a constant in the executable. So the fact that p points to null makes no difference.
The runtime value of p plays absolutely no role in the expression sizeof(p->a).
In C and C++, sizeof is an operator and not a function. It can be applied to either a type-id or an expression. Except in the case that of an expression and the expression is a variable-length array (new in C99) (as pointed out by paxdiablo), the expression is an unevaluated operand and the result is the same as if you had taken sizeof against the type of that expression instead. (C.f. C11 references due to paxdiablo below, C++14 working draft 5.3.3.1)
First up, if you want truly portable code, you shouldn't be attempting to create an array of size zero1, as you did in your original question, now fixed. But, since it's not really relevant to your question of whether sizeof(p->a) is valid when p == NULL, we can ignore it for now.
From C11 section 6.5.3.4 The sizeof and _Alignof operators (my bold):
2/ The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
Therefore no evaluation of the operand is done unless it's a variable length array (which your example is not). Only the type itself is used to figure out the size.
1 For the language lawyers out there, C11 states in 6.7.6.2 Array declarators (my bold):
1/ In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero.
However, since that's in the constraints section (where shall and shall not do not involve undefined behaviour), it simply means the program itself is not strictly conforming. It's still covered by the standard itself.
This code contains a constraint violation in ISO C because of:
struct obj
{
int a[0];
};
Zero-sized arrays are not permitted anywhere. Therefore the C standard does not define the behaviour of this program (although there seems to be some debate about that).
The code can only "run correctly" if your compiler implements a non-standard extension to allow zero-sized arrays.
Extensions must be documented (C11 4/8), so hopefully your compiler's documentation defines its behaviour for struct obj (a zero-sized struct?) and the value of sizeof p->a, and whether or not sizeof evaluates its operand when the operand denotes a zero-sized array.
sizeof() doesn't care a thing about the content of anything, it merely looks at the resulting type of the expression.
Since C99 and variable length arrays, it is computed at run time when a variable length array is part of the expression in the sizeof operand.Otherwise, the operand is not evaluated and the result is an integer constant
Zero-size array declarations within structs was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets(flexible array members).
To my function i get a void pointer, I would like to point to the next location considering the incoming pointer is of char type.
int doSomething( void * somePtr )
{
((char*)somePtr)++; // Gives Compilation error
}
I get the following compilation error:
Error[Pe137]: expression must be a modifiable lvalue
Is this an issue with the priority of operators?
A cast does not yield an lvalue (see section 6.5.4 footnote 104 of C11 standard), therefore you can't apply post increment ++ operator to its result.
c-faq: 4.5:
In C, a cast operator does not mean "pretend these bits have a different type, and treat them accordingly"; it is a conversion operator, and by definition it yields an rvalue, which cannot be assigned to, or incremented with ++. (It is either an accident or a deliberate but nonstandard extension if a particular compiler accepts expressions such as the above.)
Try this instead
char *charPtr = ((char*)somePtr);
charPtr++;
If you want to move the pointer to next then you can use:
*ptr++;
If you want to Change copy the pointer position to another variable then:
char *abc = (char*)(def + 1);
It really depends on your motive to do things
The following compiles and prints "string" as an output.
#include <stdio.h>
struct S { int x; char c[7]; };
struct S bar() {
struct S s = {42, "string"};
return s;
}
int main()
{
printf("%s", bar().c);
}
Apparently this seems to invokes an undefined behavior according to
C99 6.5.2.2/5 If an attempt is made to modify the result of a function
call or to access it after the next sequence point, the behavior is
undefined.
I don't understand where it says about "next sequence point". What's going on here?
You've run into a subtle corner of the language.
An expression of array type is, in most contexts, implicitly converted to a pointer to the first element of the array object. The exceptions, none of which apply here, are:
When the array expression is the operand of a unary & operator (which yields the address of the entire array);
When it's the operand of a unary sizeof or (as of C11) _Alignof operator (sizeof arr yields the size of the array, not the size of a pointer); and
When it's a string literal in an initializer used to initialize an array object (char str[6] = "hello"; doesn't convert "hello" to a char*.)
(The N1570 draft incorrectly adds _Alignof to the list of exceptions. In fact, for reasons that are not clear, _Alignof can only be applied to a type name, not to an expression.)
Note that there's an implicit assumption: that the array expression refers to an array object in the first place. In most cases, it does (the simplest case is when the array expression is the name of a declared array object) -- but in this one case, there is no array object.
If a function returns a struct, the struct result is returned by value. In this case, the struct contains an array, giving us an array value with no corresponding array object, at least logically. So the array expression bar().c decays to a pointer to the first element of ... er, um, ... an array object that doesn't exist.
The 2011 ISO C standard addresses this by introducing "temporary lifetime", which applies only to "A non-lvalue expression with structure or union type, where the structure or union
contains a member with array type" (N1570 6.2.4p8). Such an object may not be modified, and its lifetime ends at the end of the containing full expression or full declarator.
So as of C2011, your program's behavior is well defined. The printf call gets a pointer to the first element of an array that's part of a struct object with temporary lifetime; that object continues to exist until the printf call finishes.
But as of C99, the behavior is undefined -- not necessarily because of the clause you quote (as far as I can tell, there is no intervening sequence point), but because C99 doesn't define the array object that would be necessary for the printf to work.
If your goal is to get this program to work, rather than to understand why it might fail, you can store the result of the function call in an explicit object:
const struct s result = bar();
printf("%s", result.c);
Now you have a struct object with automatic, rather than temporary, storage duration, so it exists during and after the execution of the printf call.
The sequence point occurs at the end of the full expression- i.e., when printf returns in this example. There are other cases where sequence points occur
Effectively, this rule states that function temporaries do not live beyond the next sequence point- which in this case, occurs well after it's use, so your program has quite well-defined behaviour.
Here's a simple example of not well-defined behaviour:
char* c = bar().c; *c = 5; // UB
Here, the sequence point is met after c is created, and the memory it points to is destroyed, but we then attempt to access c, resulting in UB.
In C99 there is a sequence point at the call to a function, after the arguments have been evaluated (C99 6.5.2.2/10).
So, when bar().c is evaluated, it results in a pointer to the first element in the char c[7] array in the struct returned by bar(). However, that pointer gets copied into an argument (a nameless argument as it happens) to printf(), and by the time the call is actually made to the printf() function the sequence point mentioned above has occurred, so the member that the pointer was pointing to may no longer be alive.
As Keith Thomson mentions, C11 (and C++) make stronger guarantees about the lifetime of temporaries, so the behavior under those standards would not be undefined.