The K&R "The C Programming Language" 2nd edition says on page 131 given a
set of variables thus :
struct rect r, *rp = &r;
where :
struct rect {
struct point pt1;
struct point pt2;
};
and
struct point {
int x;
int y;
};
then these four expressions are equivalent :
r.pt1.x
rp->pt1.x
(r.pt1).x
(rp->pt1).x
Earlier on that same page we see this :
p->member_of_structure
Which is described as "refers to the particular member".
I changed the hyphens to underscores to ensure we could not be confused with
a minus sign.
So great, I can see we have what I would refer to as a nested struct because
the struct rect contains within it a struct point.
Well what if the definition of rect were such that pt1 and pt2 were both pointers
to a struct point?
Here is where I hit my troubles with the following code bits :
typedef struct some_node {
struct timespec *tv;
struct some_node *next;
} some_node_t;
Clearly I will be making a linked list here and that is no problem.
What is a really big problem is this :
struct timespec some_tv;
clock_gettime( CLOCK_REALTIME, &some_tv );
/* set up the head node */
struct some_node *node_head =
( struct some_node * ) malloc( sizeof( some_node_t ) );
node_head->tv = calloc ( (size_t) 1, sizeof ( struct timespec ) );
That all works great and I get my node_head just fine. I even get my nested
struct timespec node_head->tv just fine.
What is a real problem is trying to figure out how to set that inner tv_sec
to the value in some_tv.tv_sec like so :
((struct timespec *)(*node_head.tv)).tv_sec = some_tv.tv_sec;
I get an error :
line 22: error: left operand of "." must be struct/union object
So I am looking in the K&R and I see that the example in the book does not
have a pointer to a struct within the struct rect.
I have resorted to trial and error to get what I want but this is maddening.
I could create a temporary variable of type "struct timespec temp" and then
set temp = &node_head.tv ... but no ... that won't work. That would be
worse I think.
What am I missing here ?
The answer was trivial, of course, simply use foo->bar->here syntax.
Modify the code to drop the cast on the malloc and use the correct syntax :
/* set up the node list */
struct some_node *node_head =
calloc( (size_t) 1, sizeof( some_node_t ) );
node_head->tv = calloc ( (size_t) 1, sizeof ( struct timespec ) );
node_head->tv->tv_sec = some_tv.tv_sec;
node_head->tv->tv_nsec = some_tv.tv_nsec;
node_head->next = NULL;
The debugger confirms this :
(dbx) print *node_head
*node_head = {
tv = 0x1001453e0
next = (nil)
}
(dbx) print *node_head.tv
*node_head->tv = {
tv_sec = 1363127096
tv_nsec = 996499096
}
Works great. Clearly, I need coffee. :-)
Isn't this sufficient?
node_head->tv->tv_sec
The other answers have it, but in case it's not clear: -> is the same as doing a "dereference" with a * yourself and then applying the . operator.
so
rp->pt1
(*rp).pt1
are equivalent.
which means that you have the rule of thumb "use -> when dealing with pointers".
so in the case you had,
node_head->tv->tv_sec
will do what you want, and is equivalent to
(*node_head).tv->tv_sec
(*(*node_head).tv).tv_sec
You should change (*node_head.tv) to (*node_head).tv (or node_head->tv as Oli wrote), as . has higher precedence than *.
An initial remark: your (struct timespec *) cast of tv is not necessary, since that is already how tv is declared.
As others have indicated, you really only need to use the ptr->field notation here: node_head->tv->tv_sec. The -> notation is a bit of syntactic sugar to avoid all the (*(*rp).pt1).x clumsiness which would otherwise be required.
You asked,
Well what if the definition of rect were such that pt1 and pt2 were
both pointers to a struct point?
In that case @teppic's solution is what you want, but one subtlety to keep in mind is where the actual memory for the data structure is allocated: in your example the storage for both points lives inside rect itself, whereas in @teppic's version rect holds only pointers and each point needs storage of its own. This distinction, sometimes tricky for newcomers to C, matters particularly when you allocate and free the memory many times in long-running programs (every allocation must be freed to avoid memory leaks).
From your opening example, if you have instead:
struct rect {
struct point *pt1;
struct point *pt2;
};
struct point {
int x;
int y;
};
and you have
struct rect *rp;
Then to access x you just use:
rp->pt1->x
The compiler can figure out what's going on. You only need casting if you have void pointers and such.
I define relative pointer to mean what Ginger Bill describes as Self-Relative Pointers:
... define the base [to which an offset will be applied] to be the memory address of the offset itself
For example, consider this struct:
struct house {
int32_t weight;
};
struct person {
int32_t age;
struct house* residence;
};
int32_t getPersonsHousesWeight(struct person* p) {
return p->residence->weight;
}
The relative-pointer implementation of the same thing in C that I think might work is:
struct house { ... }; // same as before
struct person {
int32_t age;
int64_t residence; // an offset from the person's address in memory
};
int32_t getPersonsHousesWeight(struct person* p) {
return ((struct house*)((char*)p + (p->residence)))->weight;
}
Assuming that alignment of everything is good (all 8 bytes), is this free of undefined behavior?
EDIT
@tstanisl has provided an excellent answer (which I've accepted) that thoroughly explains UB in the context of stack allocations. I am curious how allocation into a large slab of contiguous heap would impact this analysis. For example:
int foo(void) {
char* base = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// Omitting mmap error checking
struct person* myPerson = (struct person*)(base + 128);
struct house* myHouse = (struct house*)(base + 256);
int32_t delta = (char*)myHouse - (char*)myPerson;
// Does the computation of delta invoke UB?
}
Usually it is going to be UB.
The first case is when person and house belong to separate objects.
In such a case it will be UB because the pointer arithmetic is performed outside of the object.
int foo(void) {
struct person p;
struct house h;
p.residence = (char*)&h - (char*)&p; // already UB
getPersonsHousesWeight(&p); // UB again
}
In practice it means that the compiler is not obligated to notice that objects accessed through pointers constructed from &p can alias with object h, because p and h are separate memory regions (aka objects).
When both objects are placed inside a larger object then the situation is a bit better. Though it still would be technical UB.
int foo(void) {
struct ph {
struct person p;
struct house h;
} ph;
ph.p.residence = (char*)&ph.h - (char*)&ph.p; // still UB
getPersonsHousesWeight(&ph.p); // UB again
}
It is UB because the pointer arithmetic is done outside the member object.
(char*)&ph.h - 1 is a pointer outside of ph.h.
Note that this code will likely work pretty much everywhere.
Otherwise, heavily used container_of-like macros would not work, breaking a lot of existing code including the Linux kernel.
To avoid UB the pointer must be constructed in a special way to avoid moving outside of the originating object.
Rather than using &ph.h one should use (char*)&ph + offsetof(struct ph, h).
Similarly &ph.p should be replaced with (char*)&ph + offsetof(struct ph, p).
Now this code should be portable:
int foo(void) {
struct ph {
struct person p;
struct house h;
} ph;
struct person *p_ptr = (struct person*)((char*)&ph + offsetof(struct ph, p));
struct house *h_ptr = (struct house*) ((char*)&ph + offsetof(struct ph, h));
ph.p.residence = (char*)h_ptr - (char*)p_ptr;
getPersonsHousesWeight(p_ptr);
}
Though it is very obscure.
The interesting discussion on this topic can be found at link
I am reading a book called "Teach Yourself C in 21 Days" (I have already learned Java and C# so I am moving at a much faster pace). I was reading the chapter on pointers and the -> (arrow) operator came up without explanation. I think that it is used to call members and functions (like the equivalent of the . (dot) operator, but for pointers instead of members). But I am not entirely sure.
Could I please get an explanation and a code sample?
foo->bar is equivalent to (*foo).bar, i.e. it gets the member called bar from the struct that foo points to.
Yes, that's it.
It's just the dot version for when you access the members of a struct/class through a pointer instead of through the object itself (or a reference).
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = malloc(sizeof(struct foo));
var.x = 5;
(&var)->y = 14.3;
pvar->y = 22.4;
(*pvar).x = 6;
That's it!
I'd just add to the answers the "why?".
. is the standard member access operator, and it has higher precedence than the * pointer operator.
If you try to access a struct's internals by writing *foo.bar, the compiler parses it as *(foo.bar), i.e. it looks for a 'bar' member of 'foo' (which is just an address in memory), and obviously that mere address does not have any members.
Thus you need to ask the compiler to dereference first with (*foo) and then access the member: (*foo).bar. That is a bit clumsy to write, so the good folks came up with a shorthand version: foo->bar, which is a sort of member-access-by-pointer operator.
a->b is just short for (*a).b in every way (same for functions: a->b() is short for (*a).b()).
foo->bar is only shorthand for (*foo).bar. That's all there is to it.
Well I have to add something as well. A structure behaves a bit differently from an array, because an array name decays to a pointer to its first element in most expressions and a structure name does not. So be careful!
Lets say I write this useless piece of code:
#include <stdio.h>
typedef struct{
int km;
int kph;
int kg;
} car;
int main(void){
car audi = {12000, 230, 760};
car *ptr = &audi;
}
Here the pointer ptr holds the address (!) of the structure variable audi, but besides an address the structure also has a chunk of data (!). The first member of that chunk has the same address as the structure itself. Note that dereferencing the pointer as *ptr (no parentheses) gives you the whole structure, not just its first member.
To access a particular member you have to add a designator like .km, .kph or .kg, which is nothing more than an offset from the base address of the chunk of data...
But because of precedence you can't write *ptr.kg: the access operator . is evaluated before the dereference operator *, so you would get *(ptr.kg), which is not possible as a pointer has no members! The compiler knows this and will therefore issue an error, e.g.:
error: ‘ptr’ is a pointer; did you mean to use ‘->’?
printf("%d\n", *ptr.km);
Instead you use (*ptr).kg, which forces the compiler to first dereference the pointer, giving access to the chunk of data, and then add an offset (designator) to choose the member.
Check this image I made:
But if you would have nested members this syntax would become unreadable and therefore -> was introduced. I think readability is the only justifiable reason for using it as this ptr->kg is much easier to write than (*ptr).kg.
Now let us write this differently so that you see the connection more clearly. (*ptr).kg ⟹ (*&audi).kg ⟹ audi.kg. Here I first used the fact that ptr is the "address of audi", i.e. &audi, and the fact that the "reference" & and "dereference" * operators cancel each other out.
struct Node {
int i;
int j;
};
struct Node a, *p = &a;
Here, to access the values of i and j, we can use the variable a and the pointer p as follows: a.i, (*p).i and p->i are all the same.
Here . is a "Direct Selector" and -> is an "Indirect Selector".
I had to make a small change to Jack's program to get it to run. After declaring the struct pointer pvar, point it to the address of var. I found this solution on page 242 of Stephen Kochan's Programming in C.
#include <stdio.h>
int main()
{
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = &var;
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
Run this in vim with the following command:
:!gcc -o var var.c && ./var
Will output:
5 - 14.30
6 - 22.40
#include<stdio.h>
int main()
{
struct foo
{
int x;
float y;
} var1;
struct foo var;
struct foo* pvar;
pvar = &var1;
/* If pvar = &var, then pvar reads the values stored
in var, and assigning new values through it
(pvar->x = 6; pvar->y = 22.4;) overwrites
the values of var itself. So it is better to point
pvar at a separate object, here var1. */
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
The -> operator makes the code more readable than the * operator in some situations.
Such as: (quoted from the EDK II project)
typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
IN EFI_BLOCK_IO_PROTOCOL *This,
IN UINT32 MediaId,
IN EFI_LBA Lba,
IN UINTN BufferSize,
OUT VOID *Buffer
);
struct _EFI_BLOCK_IO_PROTOCOL {
///
/// The revision to which the block IO interface adheres. All future
/// revisions must be backwards compatible. If a future version is not
/// back wards compatible, it is not the same GUID.
///
UINT64 Revision;
///
/// Pointer to the EFI_BLOCK_IO_MEDIA data for this device.
///
EFI_BLOCK_IO_MEDIA *Media;
EFI_BLOCK_RESET Reset;
EFI_BLOCK_READ ReadBlocks;
EFI_BLOCK_WRITE WriteBlocks;
EFI_BLOCK_FLUSH FlushBlocks;
};
The _EFI_BLOCK_IO_PROTOCOL struct contains 4 function pointer members.
Suppose you have a variable struct _EFI_BLOCK_IO_PROTOCOL * pStruct, and you want to use the good old * operator to call its member function pointer. You will end up with code like this:
(*pStruct).ReadBlocks(...arguments...)
But with the -> operator, you can write like this:
pStruct->ReadBlocks(...arguments...).
Which looks better?
#include<stdio.h>
struct examp{
int number;
};
struct examp a,*b=&a;
int main(void)
{
a.number=5;
/* a.number,b->number,(*b).number produces same output. b->number is mostly used in linked list*/
printf("%d \n %d \n %d",a.number,b->number,(*b).number);
}
The output is 5 three times, one value per line.
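Dot is the member access operator; it is used to reach a particular member of a structure variable.
Eg :
struct student
{
int s_no;
char name[32]; /* array size chosen only for illustration */
int age;
} s1,s2;
int main(void)
{
s1.name; /* refers to member name of s1 */
s2.name; /* refers to member name of s2 */
return 0;
}
In this way we can use the dot operator to access the members of a structure variable.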
I'm working my way through the Learn C the Hard Way book and have run into a few issues on Exercise 19. The author said that ex19 was intended to get learners familiar with macros in C. I have no problem understanding that concept, but I just don't understand everything else. I can't understand how the object prototype is created.
Especially, what does the following sentence mean?
Since C puts the Room.proto field first, that means the el pointer is
really only pointing at enough of the block of memory to see a full
Object struct. It has no idea that it's even called proto.
the relevant code is this:
// this seems weird, but we can make a struct of one size,
// then point a different pointer at it to "cast" it
Object *el = calloc(1, size);
*el = proto;
Can anyone tell me how on earth malloc/calloc actually works? As far as I know, it just allocates the requested number of bytes and returns the address of the first one. If so, how can the computer know the data structure of the allocated memory? For instance, in the code, after Room *arena = NEW(Room, "The arena, with the minotaur"); you can directly write arena->bad_guy = NEW(Monster, "The evil minotaur"); so how does the computer know there is a bad_guy?
What on earth is the content of *el after the above two statements (Object *el = calloc(1, size); and *el = proto;)?
Any help will be appreciated!!
the link to the exercise: http://c.learncodethehardway.org/book/ex19.html
calloc has the additional feature that it fills the allocated memory with zero bytes, whereas using the equivalent malloc call would require an additional step if all or some of the allocation needs to be zero initially.
In the code
arena->bad_guy = NEW(Monster, "The evil minotaur");
the compiler knows the layout of the struct because the access is through the arena variable, which is declared as a pointer to Room, which is presumably a typedef of a struct.
For the other part, the guarantee of ordering within structs allows a limited form of inheritance in composite structs, or extended structs.
struct A {
int x;
};
struct B {
int foo;
double baloney;
};
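struct B (or a pointer to it) can be cast to a (pointer to a) struct A because they both begin with an int. Strictly speaking, the C standard only guarantees that a pointer to a struct can be converted to a pointer to its first member, so this form relies on the two layouts matching in practice. Of course, if you cast the other way, the struct A must have been originally a struct B or access to the baloney field will be undefined. In other words, struct B essentially begins with a struct A.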
This may be easier to see if I rewrite my example like this:
struct A {
int x;
};
struct B {
struct A foo;
double baloney;
};
Now you can get a struct A out of struct B in different ways.
struct A a;
struct B b;
a = b.foo; // regular member variable access
struct A *ap = &a;
struct B *bp = &b;
ap = (struct A *)bp; // cast the pointer
ap = & b.foo; // take a pointer from the member variable
ap = & bp->foo; // take a pointer from the member variable via a pointer
All it does is to alloc 1*size bytes. There's nothing magic with malloc/calloc. He is passing the sizeof(T) to the function through that NEW macro and putting it in Object_new's size parameter. So all the function knows is the size in bytes.
In the following code, "stk" is treated as if it is a pointer. But after looking at it from every angle for hours, I cannot for the life of me see how it is a pointer. Can someone please explain what I'm missing?
struct T {
int count;
struct elem {
void *x;
struct elem *link;
} *head;
};
T Stack_new(void) {
T stk;
NEW(stk);
stk->count = 0;
stk->head = NULL;
return stk;
}
My interpretation says that T is a struct, and therefore stk is a local, automatic variable containing a struct. It is not a pointer, but then it gets treated as a pointer, leaving me stuck in a WTF state.
More Background
This code is from a book called "C Interfaces and Implementations" by Hanson. He creates a library of abstract data types that expose an interface and hide the implementation. The stack is the first one he covers. I'm a long-time programmer just now digging into C, and apparently there's some way of parsing this syntax that I'm missing. Thanks.
In case it is relevant, here is the definition for NEW and the things that new calls:
#define NEW(p) ((p) = ALLOC((long)sizeof *(p)))
#define ALLOC(nbytes) \
Mem_alloc((nbytes), __FILE__, __LINE__)
extern void *Mem_alloc (long nbytes,
const char *file, int line);
In the snippet above, T stk will declare stk as a variable of type T. However type T isn't defined anywhere, and the code won't compile.
If it instead said struct T stk;, it would be declaring stk as a variable having type struct T. However, the attempts to dereference stk would be meaningless and the code would again fail to compile.
To make the example work, you could add something like,
typedef struct T *T;
which defines type T to be a pointer to struct T. I would find this highly confusing though.
So the problem at hand is to convert a string of digits in the format YYYYMMDD to
a struct tm type member within some other structure. Truth is, I really only care
about getting a struct tm with reasonable values in it.
Consider the following struct :
typedef struct some_node {
char somestring[64];
char anotherstring[128];
struct tm date_one;
struct tm date_two;
int some_val;
struct some_node *prev;
struct some_node *next;
} some_node_t;
Inside there I have two members of type struct tm from the time.h header. Seems
very reasonable. Also there are pointer members in there to make a linked list
however that isn't the issue.
So I create the first node in my yet to be created linked list like so :
/* begin the linked list of some_node_t */
struct some_node *t_head =
calloc( (size_t) 1, sizeof( some_node_t ) );
if ( t_head == NULL ) {
/*Memory allocation fault */
printf ( " FAIL : Memory allocation fault at %s(%d)\n",
__FILE__, __LINE__ );
exit ( EXIT_FAILURE );
}
/* I used calloc above which zero fills memory so these
* next lines are not really needed. Better safe than sorry. */
t_head->some_val = 0;
t_head->prev = NULL;
t_head->next = NULL;
Then I can stuff char data into the two char members :
strcpy ( t_head->somestring, "birthday" );
strcpy ( t_head->anotherstring, "19981127" );
No problem there.
Messing with the conversion of a string to a struct tm seems reasonable within a
function as I have to do it twice perhaps.
Therefore I write this :
int timestr_to_tm ( struct tm **date_val, char *yyyymmdd ) {
/* assume 8 digits received in format YYYYMMDD */
int j, date_status = -1;
char yyyy[5]="0000";
char mm[3]="00";
char dd[3]="00";
/* copy over the year digits first */
for ( j=0; j<4; j++ )
yyyy[j]=yyyymmdd[j];
/* month digits */
mm[0]=yyyymmdd[4];
mm[1]=yyyymmdd[5];
/* day digits */
dd[0]=yyyymmdd[6];
dd[1]=yyyymmdd[7];
*(date_val)->tm_year = atoi(yyyy) - 1900;
*(date_val)->tm_mon = atoi(mm) - 1;
*(date_val)->tm_mday = atoi(dd);
*(date_val)->tm_hour = 0;
*(date_val)->tm_min = 0;
*(date_val)->tm_sec = 0;
*(date_val)->tm_isdst = -1;
return 0;
}
So my hope here is that I can pass a pointer to a pointer to the member date_one
within t_node to that function.
if ( timestr_to_tm ( &(t_node->date_one), "19981127" ) < 0 ) {
/* deal with a bad date conversion */
}
Well my compiler has a fit here. Claiming :
error: argument #1 is incompatible with prototype:
Perhaps I should have &t_head->date_one but I think that the pointer dereference
operator "->" takes precedence over the "address of" operator. Perhaps it is bad
policy to even attempt to pass a pointer to a member within a struct?
Even worse, within the function timestr_to_tm() I get :
error: left operand of "->" must be pointer to struct/union
in those lines where I try to assign values into the struct tm variable.
I tried all this without passing pointers and the process works however upon
return there is nothing in the struct tm member. So I am wondering, what am
I missing here ?
If you really want to pass a pointer to a pointer to a struct tm, you can, you just need to create a pointer variable to hold the pointer and pass a pointer to that:
struct tm *pointer = &t_node->date_one;
if ( timestr_to_tm ( &pointer, "19981127" ) < 0 ) {
...
The question is why? You don't need the extra level of indirection as you're not trying to change the pointer, you just want to fill in the struct tm within the node. So just use a single pointer:
int timestr_to_tm ( struct tm *date_val, char *yyyymmdd ) {
:
date_val->tm_year = atoi(yyyy) - 1900;
:
then you can call it with a simple pointer and don't need to create an extra pointer variable:
if ( timestr_to_tm ( &t_node->date_one, "19981127" ) < 0 ) {
...
I think that the pointer dereference operator -> takes precedence over the "address of" operator
It does, and it also takes precedence over the dereference operator *. So this, for example:
*(date_val)->tm_mday = atoi(dd);
should be
(*date_val)->tm_mday = atoi(dd);
But: Why would you do this? Why not pass a pointer to the struct tm, and use -> without one more level of indirection?
Since I'm too lazy to understand the question, I'll just add this unrelated example which hopefully helps to clear things up. This is very straightforward, if you use the correct syntax and level of indirection.
typedef struct _node node_t;
struct _node {
node_t* next;
int a;
};
void fill_in_int(int* pointer_to_int) {
*pointer_to_int = 42;
}
void populate_all_nodes(node_t* list, int a_value) {
node_t* node;
for (node=list; node; node = node->next) {
fill_in_int( &(node->a) ); // Pass pointer to member a in node
}
}