Dereferencing a pointer to a struct to access its first member - c

For specific reasons I want to access only the first member of a struct by dereferencing the pointer to the struct.
I would like to know whether this is legal or whether it can cause UB under some conditions, and what a correct solution would be if this one has any problems.
Thank you.
#include <stdio.h>
#include <stdlib.h>
typedef struct test_s
{
void * data ;
struct test_s * next ;
} test_t ;
int main( void )
{
test_t * t = calloc( 1 , sizeof( test_t ) ) ;
int n = 123;
t->data = &n ; //int is used only for an address, this could be anything, an object for example
void ** v = ( void* )t ;
printf("Address of n: %p\nAddress of *t: %p\n\n" , ( void * )&n , *v ) ; //dereference the pointer to struct to access its first member; %p expects a void *
return 0;
}

Yes, this is legal. From C99, 6.7.2.1.13:
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

Yes, this is 100% legal: the C standard specifies that a pointer to a struct, suitably converted, always equals a pointer to the initial member of that struct.
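As a minimal sketch of the rule quoted above, the following reuses the question's test_t and reads the first member through a converted pointer. The helper names first_member_via_cast and data_is_at_offset_zero are hypothetical and exist only for illustration.

```c
#include <stddef.h>

typedef struct test_s {
    void *data;
    struct test_s *next;
} test_t;

/* Reads the first member by converting test_t * to a pointer to that
   member's type (void **) and dereferencing it. */
void *first_member_via_cast(test_t *t)
{
    return *(void **)t;
}

/* Returns 1 when data sits at offset 0, which the standard guarantees
   because there is no padding at the beginning of a structure. */
int data_is_at_offset_zero(void)
{
    return offsetof(test_t, data) == 0;
}
```

Calling first_member_via_cast(&t) should give the same pointer as t.data. Note that the converted pointer type has to match the type of the first member, here void *.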

Related

What is the difference between ptr->thing and *ptr->thing in C?

My understanding is that the -> operator is shorthand for dereferencing a pointer to a struct, and accessing the value of one struct member.
struct point {
int x;
int y;
};
struct point my_point = { 3, 7 };
struct point *p = &my_point; /* p is a pointer to my_point */
(*p).x = 8; /* set the first member of the struct */
p->x = 8; /* equivalent method to set the first member of the struct */
So the last 2 lines of the example above are equivalent. But I've encountered some code similar to this:
*p->x = 8
Using both the asterisk and arrow together. What does this do? Would this try to "double dereference" the pointer and assign to memory address 8, or something else? Maybe undefined behavior, or just a compiler error?
*p->x is equivalent to *(p->x) - you are dereferencing the result of p->x, which implies the x member itself has pointer type. Given a line like
*p->x = 8;
that implies x has type int *:
struct foo {
...
int *x;
...
};
Note that x must be assigned a valid pointer value before you can assign to *x. You can allocate memory dynamically:
struct foo *p = malloc( sizeof *p ); // dynamically allocate space for 1 instance of struct foo
if ( !p )
// handle allocation failure and exit here.
p->x = malloc( sizeof *p->x ); // dynamically allocate space for 1 int.
if ( !p->x )
// handle allocation failure and exit here.
Or you can set x to point to an existing int object:
int a;
...
p->x = &a;
*p->x = 8;
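The precedence rule described in this answer can be sketched as follows. The struct is reduced to the single pointer member, and store_through_member is a hypothetical helper name, not something from the original code.

```c
struct foo {
    int *x; /* x is itself a pointer, so p->x can be dereferenced */
};

/* -> binds more tightly than unary *, so *p->x parses as *(p->x). */
int store_through_member(struct foo *p, int value)
{
    *p->x = value;   /* writes to the int that p->x points at */
    return *(p->x);  /* reads the same int back, written out explicitly */
}
```

With int a = 0; and struct foo f = { &a };, calling store_through_member(&f, 8) leaves 8 in a.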
For a structure
struct tagp
{
int *x;
} p0 = { someaddress }; /* someaddress stands for any valid int * value */
struct tagp *p = &p0;
*p->x accesses the object at the address stored in the pointer x inside the structure. It is the same as *((*p).x) and *(p0.x), both of which access the memory at someaddress.

Why is it dangerous?

I don't understand some topic from book Extreme C on page 300. It's about "multiple inheritance".
typedef struct { ... } a_t;
typedef struct { ... } b_t;
typedef struct {
a_t a;
b_t b;
...
} c_t;
c_t c_obj;
a_t* a_ptr = (a_ptr*)&c_obj;
b_t* b_ptr = (b_ptr*)&c_obj; //it's the problem
c_t* c_ptr = &c_obj;
Why should we do something like that?
c_t c_obj;
a_t* a_ptr = (a_ptr*)&c_obj;
b_t* b_ptr = (b_ptr*)(&c_obj + sizeof(a_t)); //?Is the address a_ptr the same as address c_obj?
c_t* c_ptr = &c_obj;
Thank you very much for all your help.
Why should we do...?
We should not do this!
First, sizeof(a_t) might be the wrong amount by which to advance the address. If b_t has a larger alignment requirement than a_t, padding is inserted between a and b, and sizeof(a_t) will not yield the correct offset.
Second, the expression is wrong:
(&c_obj + sizeof(a_t))
This takes the address of c_obj, which has type c_t*. Pointer arithmetic then advances it by sizeof(a_t) elements of type c_t, that is, by sizeof(a_t) * sizeof(c_t) bytes. The result still has type c_t* but points far outside c_obj, so it is not a valid address.
Third: Your casts are all wrong. You need to use the name of a type, not a variable.
If you want to get the address of b, the C standard library provides the offsetof macro (declared in <stddef.h>):
offsetof(c_t,b)
evaluates to the offset in bytes of member b inside type c_t.
Then you can apply this to your address:
b_ptr = (b_t*)((unsigned char*)&c_obj + offsetof(c_t, b));
The cast to unsigned char* is required so that the arithmetic is done in bytes.
Of course, there is a much simpler way to do this:
b_ptr=&c_obj.b;
Maybe the point of the book was to show that you cannot just use address of a struct and cast it to a pointer to another struct but you have to take care about the location of the members inside that struct. That is correct.
But the dirty details were a bit off.
Keep it simple:
c_t c_obj;
a_t* a_ptr = &c_obj.a;
b_t* b_ptr = &c_obj.b;
c_t* c_ptr = &c_obj;
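To compare the two ways of reaching b mentioned in this answer, here is a small sketch. The contents of a_t and b_t are elided in the question, so the int and double members below are assumptions made only to get a compilable example, and the function names are hypothetical.

```c
#include <stddef.h>

typedef struct { int i; } a_t;     /* assumed contents */
typedef struct { double d; } b_t;  /* assumed contents */
typedef struct {
    a_t a;
    b_t b;
} c_t;

/* Byte arithmetic: step from the start of the object to member b. */
b_t *b_via_offsetof(c_t *c)
{
    return (b_t *)((unsigned char *)c + offsetof(c_t, b));
}

/* The simple way: let the compiler compute the member address. */
b_t *b_via_member(c_t *c)
{
    return &c->b;
}
```

Both functions should return the same pointer for any c_t object. The second form is preferable in ordinary code because it needs no casts.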
The problem with:
b_t* b_ptr = (b_ptr*)&c_obj;//it's the problem
is that b_ptr ends up pointing to c_obj.a.
The problems with:
b_t* b_ptr = (b_ptr*)(&c_obj + sizeof(a_t));//?Is the address a_ptr the same as address c_obj?
are:
You are trying to adjust the pointer by sizeof(a_t) bytes, but pointer arithmetic is scaled by the size of the dereferenced type of the pointer. In this case, the pointer type is c_t* (from the expression &c_obj), and the dereferenced type is c_t, so the pointer is actually being adjusted by sizeof(c_t) * sizeof(a_t) bytes.
There may be padding after some of the members of c_t. In particular, there may be padding between the a and b members, so the b member may not be at the offset that you think it is. The offset of b from the start of c_t in bytes can be determined using the expression offsetof(c_t, b).
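The scaling of pointer arithmetic described above can be observed with a short sketch. It uses an array so that the arithmetic stays inside one object, since stepping past a single c_obj would itself be invalid. The contents of c_t and the name bytes_advanced are assumptions for illustration.

```c
#include <stddef.h>

typedef struct { int i; double d; } c_t; /* assumed contents */

/* Returns the distance in bytes between base and base + n.
   base + n must stay within the same array (or one past its end). */
ptrdiff_t bytes_advanced(c_t *base, size_t n)
{
    return (char *)(base + n) - (char *)base;
}
```

For an array c_t arr[4], bytes_advanced(arr, 3) equals 3 * sizeof(c_t), not 3. This is why adding sizeof(a_t) to &c_obj moves the pointer by sizeof(a_t) * sizeof(c_t) bytes.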
The address of an object of a structure type is equal to the address of the first member of the structure type.
So using your example
typedef struct {
a_t a;
b_t b;
...
} c_t;
c_t c_obj;
a_t* a_ptr = (a_t*)&c_obj;
then indeed the address of the data member a is equal to the address of the object c_obj.
However for the data member b this relation is broken because the data member b is not the first data member of the structure c_t.
As for this statement
b_t* b_ptr = (b_ptr*)(&c_obj + sizeof(a_t));
then it is entirely wrong. For starters, the sub-expression &c_obj + sizeof(a_t) uses pointer arithmetic, so the value of &c_obj is incremented by sizeof( c_t ) * sizeof( a_t ) bytes.
It seems you mean
b_t* b_ptr = (b_t*)(( char * )&c_obj + sizeof(a_t));
However, even this expression will not in general yield the address of the data member b, because padding may be inserted between a and b for alignment.
Consider the following demonstrative program.
#include <stdio.h>
struct A
{
int x;
};
struct B
{
double y;
};
struct C
{
struct A a;
struct B b;
};
int main(void)
{
struct C c = { { 1 }, { 2.2 } };
printf( "&c.a = %p\n( char * )( &c.a ) + sizeof( struct A ) = %p\n&c.b = %p\n",
( void * )&c.a, ( void * ) ( ( char * )&c.a + sizeof( struct A ) ), ( void * )&c.b );
return 0;
}
Its output might look like
&c.a = 0x7ffe5e0697c0
( char * )( &c.a ) + sizeof( struct A ) = 0x7ffe5e0697c4
&c.b = 0x7ffe5e0697c8
As you can see, padding bytes were inserted after the data member a so that the next data member b is aligned for double.

Casting struct * to int * to be able to write into first field

I've recently found this page:
Making PyObject_HEAD conform to standard C
and I'm curious about this paragraph:
Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int , the struct * may also be cast to an int * , allowing to write int values into the first field.
So I wrote this code to check with my compilers:
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = malloc(sizeof(struct with_int));
i->a = 5;
((int *)&i)->a = 8;
}
but I'm getting error: request for member 'a' in something not a struct or union.
Did I get the above paragraph right? If no, what am I doing wrong?
Also, if someone knows where C standard is referring to this rule, please point it out here. Thanks.
Your interpretation [1] is correct, but the code isn't.
The pointer i already points to the object, and thus to the first element, so you only need to cast it to the correct type:
int* n = ( int* )i;
then you simply dereference it:
*n = 345;
Or in one step:
*( int* )i = 345;
[1] Quoted from ISO/IEC 9899:201x, 6.7.2.1 Structure and union specifiers, paragraph 15:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
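The "and vice versa" part of the quoted paragraph means the conversion works in both directions. A minimal sketch, using the question's struct with_int and two hypothetical helper names:

```c
struct with_int {
    int a;
    char b;
};

/* struct pointer -> pointer to the initial member */
int *to_first(struct with_int *s)
{
    return (int *)s;
}

/* pointer to the initial member -> pointer to the enclosing struct.
   Only valid when first really points at the a member of a struct with_int. */
struct with_int *from_first(int *first)
{
    return (struct with_int *)first;
}
```

Writing through to_first(&w) changes w.a, and from_first(&w.a) gives back &w. The reverse conversion is an assumption-laden operation: it is only meaningful when the int really is the first member of such a struct.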
You have a few issues, but this works for me:
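#include <stdio.h>
#include <stdlib.h> /* malloc and free; <malloc.h> is non-standard */
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = malloc(sizeof *i);
if (!i)
return 1; /* allocation failed */
i->a = 5;
*(int *)i = 8; /* write to the first member through an int * */
printf("%d\n", i->a);
free(i);
return 0;
}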
Output is:
8
Like other answers have pointed out, I think you meant:
// Interpret (struct with_int *) as (int *), then
// dereference it to assign the value 8.
*((int *) i) = 8;
and not:
((int *) &i)->a = 8;
However, none of the answers explain specifically why that error makes sense.
Let me explain what ((int *) &i)->a means:
i is a variable that holds the address of a struct with_int. &i is the address of the variable i itself (typically in main()'s stack frame), so &i is a pointer to a pointer to struct with_int. The cast (int *) tells the compiler to treat that address as an int pointer, that is, the address of an int. Finally, with ->a, you are asking the compiler to fetch the struct member a through this int pointer and assign the value 8 to it. An int has no members, so the -> operator cannot be applied to an int pointer. Hence, you get error: request for member 'a' in something not a struct or union.
Hope this helps.

Is it valid to call free with a pointer to the first member?

Is it okay to call free on a pointer which is pointing at the first member of a struct (and the struct is the one involved with malloc)? I know in principle the pointer is pointing at the right thing anyway...
struct s {int x;};
//in main
struct s* test;
test = (struct s*) malloc(sizeof(*test));
int* test2;
test2 = &(test->x);
free(test2); //is this okay??
Also, will the answer change if int x is replaced with a struct?
Update: Why would I want to write code like this?
struct s {int x;};
struct sx1 {struct s test; int y;}; //extending struct s
struct sx2 {struct s test; int z;}; //another
// ** some functions to keep track of the number of references to each variable of type struct s
int release(struct s* ptr){
//if the number of references to *ptr is 0 call free on ptr
}
int main(){
struct sx1* test1;
struct sx2* test2;
test1 = (struct sx1*) malloc(sizeof(*test1));
test2 = (struct sx2*) malloc(sizeof(*test2));
//code that changes the number of references to test1 and test2, calling functions defined in **
release(&test1->test); //pointer to the first member, same address as test1
release(&test2->test);
}
Yes this is ok.
6.7.2.1
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
Which means that this is defined:
struct s {int x;};
struct s* test;
test = (struct s*) malloc(sizeof(*test));
int* p = &(test->x);
free(p);
As per the C11 standard, chapter §6.7.2.1
[...] There may be unnamed
padding within a structure object, but not at its beginning.
which means there cannot be any padding at the beginning of a structure. So, the first member will have the same address as that of the structure variable.
free() needs a pointer that was previously returned by malloc(), calloc(), or realloc().
In your case, you're passing the same address that malloc() had returned. So, you're good to go.
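The "extended struct" pattern from the question can be sketched like this, reusing its struct s and struct sx1. The allocator make_sx1 is a hypothetical helper, and the reference counting from the question is left out, so release here simply frees.

```c
#include <stdlib.h>

struct s { int x; };
struct sx1 { struct s test; int y; }; /* extends struct s */

/* Allocates a struct sx1 and hands back a pointer to its first member.
   That pointer has the same address malloc returned. */
struct s *make_sx1(int x, int y)
{
    struct sx1 *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->test.x = x;
    p->y = y;
    return &p->test;
}

/* Frees the whole allocation through the pointer to the first member. */
void release(struct s *ptr)
{
    free(ptr);
}
```

A caller that knows the object is really a struct sx1 can convert the struct s * back with a cast to reach y, and release works the same way for any struct whose first member is a struct s.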

Converting a pointer to a struct to its first member

Consider the following example program:
#include <stdio.h>
struct base {
int a, b;
};
struct embedded {
struct base base;
int c, d;
};
struct pointed {
struct base* base;
int c, d;
};
static void base_print(struct base* x) {
printf("a: %d, b: %d\n", x->a, x->b);
}
static void tobase_embedded(void* object) {
base_print(object); // no cast needed, suitably converted into first member.
}
static void tobase_pointed(void* object) {
struct base* x = *(struct base**) object; // need this cast?
base_print(x);
}
int main(void) {
struct embedded em = {{4, 2}};
struct pointed pt = {&em.base};
tobase_embedded(&em);
tobase_pointed(&pt);
return 0;
}
Compiled with:
$ gcc -std=c99 -O2 -Wall -Werror -pedantic -o main main.c
The expected output is:
$ ./main
a: 4, b: 2
a: 4, b: 2
The C99 standard says this about the first member of a structure:
C99 6.7.2.1 (13):
A pointer to a structure object, suitably converted, points to its initial member... and vice versa.
There may be unnamed padding within a structure object, but not at its beginning.
In the example program a pointer to struct embedded is converted to a pointer to struct base (through void*) without the need for an explicit cast.
What if instead the first member is a pointer to base as in struct pointed? I'm unsure about the cast within tobase_pointed. Without the cast garbage is printed, but no compilation warnings/errors. With the cast the correct values for base.a and base.b are printed, but that doesn't really mean much if there is undefined behavior.
Is the cast to convert struct pointed into its first member struct base* correct?
The code doesn't just cast; it also dereferences the resulting pointer to a pointer to struct base. This is necessary to obtain the pointer to base in the first place.
This is what happens in your code if the body of tobase_pointed is written out inline:
struct pointed pt = {&em.base};
void* object = &pt; //pass to the function
struct base** bs = object; //the cast in the function
assert( bs == (struct base**)&pt ) ; //bs points to the struct pointed
assert( bs == &(pt.base) ) ; //bs also points to the initial member struct base* base
struct base* b = *bs ; //the dereference in the function
base_print(b);
bs is the pointer that was suitably converted to point to the initial member. Your code is correct.
This cast is justified, and you need it because you are converting a void pointer into a pointer to a pointer. Without the cast and the dereference, the function would treat the bytes of pt itself as a struct base.
In other words, the base member (a struct base*) has the same address as the pt object, so you can reach it through a pointer to pt, but you have to dereference that pointer to get the struct base* stored there.
