Why does C have both . and -> for addressing struct members? [duplicate] - c

Why does C have both . and -> for addressing struct members?
Is it possible to have such modified language syntax, where we can take p as a pointer to struct and get a struct member's value just as p.value?

You can think of p->m as shorthand for (*p).m

From the C99 Spec.
The first operand of the . operator shall have a qualified or unqualified structure or union
type, and the second operand shall name a member of that type.
The first operand of the -> operator shall have type pointer to qualified or unqualified
structure or pointer to qualified or unqualified union, and the second operand shall
name a member of the type pointed to.
My guess is that two operators were used for member access so the two cases can be told apart, i.e. -> for a pointer to a struct and . for an ordinary struct variable.
For example:
struct sample E, *E1;
the expression (&E)->MOS is the same as E.MOS and (*E1).MOS is the same as E1->MOS

Is it possible? Yes. The syntax is as follows:
(*ptr).member
The parentheses are required because the structure member operator . has higher precedence than the indirection operator *. But after using that a few times you will agree that the following is easier to use:
ptr->member
Why does C have both? Pointers to structures are used so often in C that a special operator was created, called the structure pointer operator ->. Its job is to access members through a pointer to a structure more clearly and conveniently.

. is for a struct variable, and -> is for a pointer. If p is a pointer, you can write p->value or (*p).value; they are the same.

Related

C: does the assignment operator deep copy?

For scalar values, the assignment operator seems to copy the right-hand side value to the left. How does that work for composite data types? Eg, if I have a nested struct
struct inner {
int b;
};
struct outer {
struct inner a;
};
int main() {
struct outer s1 = { .a = {.b=1}};
struct outer s2 = s1;
}
does the assignment recursively deep copy the values?
does the same happen when passing the struct to a function?
By experimenting it seems like it does, but can anyone point to the specification of the behavior?
There is no "recursion"; it copies all the (value) bits of the value. Pointers are not magically followed of course, the assignment operator wouldn't know how to duplicate the pointed-to data.
You can think of
a = b;
as shorthand for
memcpy(&a, &b, sizeof a);
The sizeof is misleading of course, since we know the types are the same on both sides but I don't think __typeof__ helps.
The draft C11 spec says (in 6.5.16.1 Simple assignment, paragraph 2):
In simple assignment (=), the value of the right operand is converted to the
type of the assignment expression and replaces the value stored in the object
designated by the left operand.
does the assignment recursively deep copy the values?
Yes, in the sense that every member is copied, including nested structs, just as if you had used memcpy. Pointers are copied, but not what they point at. The term "deep copy" often means: also copy what the pointers point at (for example in a C++ copy constructor).
One exception: padding bytes may hold indeterminate values after the assignment, which means comparing two structs with memcmp can be unreliable.
does the same happen when passing the struct to a function?
Yes. See the reference to 6.5.2.2 below.
By experimenting it seems like it does, but can anyone point to the specification of the behavior?
C17 6.5.16:
An assignment operator stores a value in the object designated by the left operand. An
assignment expression has the value of the left operand after the assignment, but is not
an lvalue. The type of an assignment expression is the type the left operand would have
after lvalue conversion.
(Lvalue conversion isn't relevant here, since both operands must have compatible struct types. Within a single translation unit, that means they are the same struct type.)
C17 6.5.16.1 Simple assignment:
the left operand has an atomic, qualified, or unqualified version of a structure or union
type compatible with the type of the right;
C17 6.5.2.2 Function calls, §7:
If the expression that denotes the called function has a type that does include a prototype,
the arguments are implicitly converted, as if by assignment, ...

Difference in calling structure variables [duplicate]

What is the difference between -> and . when accessing members of a structure? I have seen both used in various scenarios but couldn't identify the difference between them.
-> means you have a variable that points to a piece of memory containing the struct. To access a member, you must dereference the pointer and then add the offset of the member; the -> operator does that for you.
The . means your variable is the structure itself, and you only need to add the offset of the member.
As user Eliot B points out, if you have a pointer s to a struct, then accessing the member elem can be done in two ways: s->elem or (*s).elem.
With (*s) you have an expression that "is" the struct and you now use the dot-operator to access elem.
s->elem is equal to (*s).elem
https://en.wikipedia.org/wiki/Dereference_operator
The difference lies in how the structure instance is declared. Which of '->' and '.' you use always depends on the left operand.
If the left operand is a pointer, then you use '->'; otherwise you use '.'.
For example:
struct Foo bar1;
struct Foo* bar2 = malloc(sizeof(struct Foo));
bar1.variable = "text";
bar2->variable = "text";
x->y (-> is the pointer to member operator) is equivalent to (*x).y. Because of operator precedence, you can't write *x.y as that would be evaluated as *(x.y).
The former is easier to type and is a lot clearer. It's used when x is a pointer to a structure containing the member y.

Why is it not possible to assign a pointer to an array? [duplicate]

In C, I'm coding this
char * real = strdup("GEORGE");
char one[1024];
one = real;
and it gives error:
invalid initializer
any suggestions?
is there any chance I can make array of chars equal to char pointer?
In your code, one is a variable of type array. Thus,
one = real;
is an attempt to assign to an array type, which is not allowed.
To elaborate, array names are not modifiable lvalues, and the assignment operator only works on modifiable lvalues as the LHS operand.
Quoting C11, chapter §6.5.16
An assignment operator shall have a modifiable lvalue as its left operand.
and then, chapter §6.3.2.1, (emphasis mine)
A modifiable lvalue is an lvalue that
does not have array type, does not have an incomplete type, does not have a const qualified
type, and if it is a structure or union, does not have any member (including,
recursively, any member or element of all contained aggregates or unions) with a const qualified
type.
You need to use strcpy() to copy the content to the array.
C requires constants in array initializers. You are allowed to do this:
char one[1024] = "GEORGE";
or this
char one[1024] = {'G','E','O','R','G','E'};
but assigning a pointer to an array is not allowed under any circumstances, initializer or not.
On the other hand, you can copy a content of a char pointer into an array. Depending on whether your source array is null-terminated or not, you can use strcpy or memcpy, like this:
strcpy(one, real);
or
memcpy(one, real, 7);

Why can this C code run correctly? [duplicate]

The C code looks like this:
#include <stdio.h>
#include <unistd.h>
#define DIM(a) (sizeof(a)/sizeof(a[0]))
struct obj
{
int a[1];
};
int main()
{
struct obj *p = NULL;
printf("%zu\n", DIM(p->a));
return 0;
}
The object pointer p is NULL, so I think p->a is illegal.
But I have tested this code on Ubuntu 14.04, and it executes correctly. So I want to know why...
Note: the original code had int a[0] above but I've changed that to int a[1] since everyone seems to be hung up on that rather than the actual question, which is:
Is the expression sizeof(p->a) valid when p is equal to NULL?
Because sizeof is a compile-time construct, it does not depend on evaluating its operand. sizeof(p->a) is determined solely by the declared type of the member a, and it becomes a constant in the executable. So the fact that p is null makes no difference.
The runtime value of p plays absolutely no role in the expression sizeof(p->a).
In C and C++, sizeof is an operator and not a function. It can be applied to either a type-id or an expression. Except when the expression has variable-length array type (new in C99, as pointed out by paxdiablo), the expression is an unevaluated operand and the result is the same as if you had taken sizeof of the type of that expression instead. (Cf. C11 references due to paxdiablo below, C++14 working draft 5.3.3.1)
First up, if you want truly portable code, you shouldn't be attempting to create an array of size zero [1], as you did in your original question, now fixed. But, since it's not really relevant to your question of whether sizeof(p->a) is valid when p == NULL, we can ignore it for now.
From C11 section 6.5.3.4 The sizeof and _Alignof operators (my bold):
2/ The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
Therefore no evaluation of the operand is done unless it's a variable length array (which your example is not). Only the type itself is used to figure out the size.
[1] For the language lawyers out there, C11 states in 6.7.6.2 Array declarators (my bold):
1/ In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero.
However, since that's in the constraints section (where shall and shall not do not involve undefined behaviour), it simply means the program itself is not strictly conforming. It's still covered by the standard itself.
This code contains a constraint violation in ISO C because of:
struct obj
{
int a[0];
};
Zero-sized arrays are not permitted anywhere. Therefore the C standard does not define the behaviour of this program (although there seems to be some debate about that).
The code can only "run correctly" if your compiler implements a non-standard extension to allow zero-sized arrays.
Extensions must be documented (C11 4/8), so hopefully your compiler's documentation defines its behaviour for struct obj (a zero-sized struct?) and the value of sizeof p->a, and whether or not sizeof evaluates its operand when the operand denotes a zero-sized array.
sizeof() doesn't care a thing about the content of anything, it merely looks at the resulting type of the expression.
Since C99 introduced variable length arrays, the result is computed at run time when the operand has variable length array type. Otherwise, the operand is not evaluated and the result is an integer constant.
Zero-size array declarations within structs were never permitted by any C standard, but some older compilers allowed them before it became standard for compilers to allow incomplete array declarations with empty brackets (flexible array members).

Why does this implementation of offsetof() work?

In ANSI C, offsetof is defined as below.
#define offsetof(st, m) \
((size_t) ( (char *)&((st *)(0))->m - (char *)0 ))
Why won't this throw a segmentation fault since we are dereferencing a NULL pointer? Or is this some sort of compiler hack where it sees that only address of the offset is taken out, so it statically calculates the address without actually dereferencing it? Also is this code portable?
At no point in the above code is anything dereferenced. A dereference occurs when the * or -> is used on an address value to find referenced value. The only use of * above is in a type declaration for the purpose of casting.
The -> operator is used above but it's not used to access the value. Instead it's used to grab the address of the value. Here is a non-macro code sample that should make it a bit clearer
SomeType *pSomeType = GetTheValue();
int* pMember = &(pSomeType->SomeIntMember);
The second line does not actually cause a dereference (implementation dependent). It simply returns the address of SomeIntMember within the pSomeType value.
What you see is a lot of casting between arbitrary types and char pointers. The reason for char is that it's one of the only types (perhaps the only type) in the C89 standard with an explicit size: sizeof(char) is 1 by definition. Because the size is one, arithmetic on char pointers counts bytes, which lets the above code do the evil magic of calculating the true offset of the value.
Although that is a typical implementation of offsetof, it is not mandated by the standard, which just says:
The following types and macros are defined in the standard header <stddef.h> [...]
offsetof(type,member-designator)
which expands to an integer constant expression that has type size_t, the value of
which is the offset in bytes, to the structure member (designated by member-designator),
from the beginning of its structure (designated by type). The type and member designator
shall be such that given
static type t;
then the expression &(t.member-designator) evaluates to an address constant. (If the specified member is a bit-field, the behavior is undefined.)
Read P J Plauger's "The Standard C Library" for a discussion of it and the other items in <stddef.h> which are all border-line features that could (should?) be in the language proper, and which might require special compiler support.
It's of historic interest only, but I used an early ANSI C compiler on 386/IX (see, I told you of historic interest, circa 1990) that crashed on that version of offsetof but worked when I revised it to:
#define offsetof(st, m) ((size_t)((char *)&((st *)(1024))->m - (char *)1024))
That was a compiler bug of sorts, not least because the header was distributed with the compiler and didn't work.
In ANSI C, offsetof is NOT defined like that. One of the reasons it's not defined like that is that some environments will indeed throw null pointer exceptions, or crash in other ways. Hence, ANSI C leaves the implementation of offsetof( ) open to compiler builders.
The code shown above is typical for compilers/environments that do not actively check for NULL pointers, but fail only when bytes are read from a NULL pointer.
To answer the last part of the question, the code is not portable.
The result of subtracting two pointers is defined and portable only if the two pointers point to objects in the same array or point to one past the last object of the array (7.6.2 Additive Operators, H&S Fifth Edition)
Listing 1: A representative set of offsetof() macro definitions
// Keil 8051 compiler
#define offsetof(s,m) (size_t)&(((s *)0)->m)
// Microsoft x86 compiler (version 7)
#define offsetof(s,m) (size_t)(unsigned long)&(((s *)0)->m)
// Diab Coldfire compiler
#define offsetof(s,memb) ((size_t)((char *)&((s *)0)->memb-(char *)0))
typedef struct
{
int i;
float f;
char c;
} SFOO;
int main(void)
{
printf("Offset of 'f' is %zu\n", offsetof(SFOO, f));
}
The various operators within the macro are evaluated in an order such that the following steps are performed:
1. ((s *)0) takes the integer zero and casts it as a pointer to s.
2. ((s *)0)->m dereferences that pointer to point to structure member m.
3. &(((s *)0)->m) computes the address of m.
4. (size_t)&(((s *)0)->m) casts the result to an appropriate data type.
By definition, the structure itself resides at address 0. It follows that the address of the field pointed to (Step 3 above) must be the offset, in bytes, from the start of the structure.
It doesn't segfault because you're not dereferencing it. The pointer address is being used as a number that's subtracted from another number, not used to address memory operations.
It calculates the offset of the member m relative to the start address of the representation of an object of type st.
((st *)(0)) refers to a NULL pointer of type st *.
&((st *)(0))->m refers to the address of member m in this object. Since the start address of this object is 0 (NULL), the address of member m is exactly the offset.
char * conversion and the difference calculates the offset in bytes. According to pointer operations, when you make a difference between two pointers of type T *, the result is the number of objects of type T represented between the two addresses contained by the operands.
Quoting the C standard for the offsetof macro:
C standard, section 6.6, paragraph 9
An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it shall be created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly by the use of an expression of array or function type. The array-subscript [] and member-access . and -> operators, the address & and indirection * unary operators, and pointer casts may be used in the creation of an address constant, but the value of an object shall not be accessed by use of these operators.
The macro is defined as
#define offsetof(type, member) ((size_t)&((type *)0)->member)
and the expression comprises the creation of an address constant.
Strictly speaking, the result is not an address constant, because it does not point to an object of static storage duration. It is still agreed, though, that the value of an object shall not be accessed, so the integer constant cast to pointer type will not be dereferenced.
Also, consider this quote from the C standard:
C standard, section 7.19, paragraph 3
The type and member designator shall be such that given
static type t;
then the expression &(t.member-designator) evaluates to an address constant. (If the
specified member is a bit-field, the behavior is undefined.)
A struct in C is a composite data type (or record) declaration that defines a physically grouped list of variables under one name in a block of memory, allowing the different variables to be accessed via a single pointer or by the struct declared name which returns the same address.
From the compiler perspective, the struct declared name is an address and the member designator is an offset from that address.
