Casting struct * to int * to be able to write into first field - c

I've recently found this page:
Making PyObject_HEAD conform to standard C
and I'm curious about this paragraph:
Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int , the struct * may also be cast to an int * , allowing to write int values into the first field.
So I wrote this code to check with my compilers:
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = malloc(sizeof(struct with_int));
i->a = 5;
((int *)&i)->a = 8;
}
but I'm getting error: request for member 'a' in something not a struct or union.
Did I get the above paragraph right? If no, what am I doing wrong?
Also, if someone knows where C standard is referring to this rule, please point it out here. Thanks.

Your interpretation[1] is correct, but the code isn't.
The pointer i already points to the object, and thus to the first element, so you only need to cast it to the correct type:
int* n = ( int* )i;
then you simply dereference it:
*n = 345;
Or in one step:
*( int* )i = 345;
[1] (Quoted from ISO/IEC 9899:201X, 6.7.2.1 Structure and union specifiers, paragraph 15)
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.

You have a few issues, but this works for me:
#include <stdlib.h>
#include <stdio.h>
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = (struct with_int *)malloc(sizeof(struct with_int));
i->a = 5;
*(int *)i = 8;
printf("%d\n", i->a);
}
Output is:
8

Like other answers have pointed out, I think you meant:
// Interpret (struct with_int *) as (int *), then
// dereference it to assign the value 8.
*((int *) i) = 8;
and not:
((int *) &i)->a = 8;
However, none of the answers explain specifically why that error makes sense.
Let me explain what ((int *) &i)->a means:
i is a variable that holds the address of a struct with_int. &i is the address of the variable i itself, which lives in main()'s stack space. This means &i is an address that contains the address of a struct with_int; in other words, &i is a pointer to a pointer to struct with_int. The cast (int *) then tells the compiler to interpret this stack address as an int pointer, that is, the address of an int. Finally, with ->a, you are asking the compiler to fetch the struct member a through this int pointer and assign the value 8 to it. It doesn't make sense to fetch a struct member through an int pointer. Hence, you get error: request for member 'a' in something not a struct or union.
Hope this helps.

Related

Why does this pointer declaration work with malloc()?

I am new to C and have a question about a specific pointer declaration:
Here is a program i wrote:
#include <stdlib.h>
struct n {
int x;
int y;
};
int main()
{
struct n **p = malloc(sizeof(struct n));
return 0;
}
The declaration here is not correct, but why not?
Here is my thought process:
The man page of malloc specifies that it returns a pointer:
The malloc() and calloc() functions return a pointer to the
allocated memory, which is suitably aligned for any built-in
type.
The type of p is struct n** aka a pointer to another pointer.
But shouldn't this declaration work in theory because:
malloc returns type struct n* (a pointer)
and p points to the pointer that malloc returns
so it is essentially a pointer to another pointer
so the type of p is fulfilled
Sorry if this is a dumb question but i am genuinely confused about why this does not work. Thanks for any help in advance.
The return type of malloc is not struct n *, regardless of how it is called. The return type of malloc is void *.
Initializing a struct n ** object with a value of type void * implicitly converts it to struct n **. This implicit conversion is allowed, because the rules for initialization follow the rules for assignment in C 2018 6.5.16.1 1, which state one of the allowed assignments is:
… the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version of void, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;…
and p points to the pointer that malloc returns
No, the value p is initialized to the value that malloc returns. Then p points to the memory malloc allocated.
It is a mistake for this code to allocate enough space for a struct n (using malloc(sizeof(struct n))) but assign the address of that space to the struct n ** that is p. To point to a struct n, use struct n *p = malloc(sizeof (struct n)); or, preferably, struct n *p = malloc(sizeof *p);.
To point to a pointer to a struct n, first create some pointer to a struct n, as with the above struct n *p = malloc(sizeof *p);. Then a pointer to that pointer would be struct n **pp = &p;.
If you wanted to allocate space for those pointers-to-pointers, you could do it with struct n **pp = malloc(sizeof *pp);, after which you could fill in the pointer to the struct n with *pp = malloc(sizeof **pp);. However, you should not add this additional layer of allocation without good reason.
Note that the form MyPointer = malloc(sizeof *MyPointer); is often preferred to MyPointer = malloc(sizeof (SomeType)); because the former automatically uses the type that MyPointer points to. The latter is prone to errors, such as somebody misinterpreting the type of MyPointer and not setting SomeType correctly, or somebody later changing the declaration of MyPointer but omitting to make the corresponding change to SomeType in the malloc call.
This doesn't really work. In this example, malloc is returning a void *, which points to a newly allocated place in the heap that is large enough to hold a struct n.
Note that malloc() returns type void *, which is basically a pointer to any potential type; malloc() DOES NOT return type struct n * (a pointer to your declared struct type). void * has the special property that it can be converted to any other object pointer type, and back again.
All together this means that your code doesn't actually work, since the first dereference of your struct n** would be a struct n, not a struct n*. If you tried to dereference twice, you would most likely get an invalid memory reference and crash.
The reason that your code can compile and run without crashing is because:
The compiler implicitly converts void * to struct n **
You never attempt to actually dereference your pointer, which means you never attempt an invalid memory reference.
Simplify the problem and understanding may come:
int main()
{
struct n data;
struct n *pData = &data; // skipped in your version
struct n **ppData = &pData;
// struct n **ppData = &data; // Will not compile
// struct n **ppData = (struct n **)&data; // Casting but wrong!
return 0;
}
Because malloc() returns a void ptr, you are free to store the pointer into anything that is a pointer datatype. On your head if you put it into the wrong datatype...

Cast 0 to struct pointer

So, I have this code here:
struct mystruct {
char a;
union {
char a[8];
char b[16];
} u;
};
void fuu(void)
{
struct mystruct s;
printf("%ld %ld\n", sizeof(s), &((struct mystruct *)0)->u.b);
}
The snippet that confuses me is:
&((struct mystruct *)0)->u.b
As far as I can understand it, firstly pointer to struct is casted to lvalue int of 0(what?)
Then, u.b is taken out of this pointer(which should be a pointer to the start of char array b)
Then address of this pointer is taken and printed to the screen.
The most confusing moment of all this is cast to 0.
Can someone explain in detail what is happening in this snippet?
Let's break it down a bit. Assume that we have a variable x that is a pointer to a mystruct:
struct mystruct *x;
And replace that in the expression in question:
&(x->u.b)
This takes the address of the member u.b in the struct that x points to. We can guess that this address should be slightly higher than x itself, because u is not the first member in the struct.
Then, add the fact that x is zero:
struct mystruct *x = (struct mystruct *)0;
Then, the value of the expression above will be slightly higher than 0. Or in other words, it will be the offset of u.b within the memory layout of the struct.
In fact, at least on my machine it's 1, because the only member in the struct before u is char a, which takes up 1 byte.*
Another way to do this is to use the offsetof macro defined in stddef.h:
#include <stddef.h>
...
printf("%zu %zu\n", sizeof(s), offsetof(struct mystruct, u.b));
It has the same effect, but it might be easier to understand what the code is trying to do.
* And none of the members of the union have greater alignment requirements. Try changing either of the members of the union from char to int - what happens?
The offset is printed as 4 instead of 1, because on most platforms an int needs to be stored at an address divisible by 4. So the struct will contain one byte for char a, three unused bytes for alignment, and then the union.

Limitation of converting pointer to one type to pointer to another type

I'm having some trouble understanding conversion between "pointer to" types. Let me provide some examples:
struct test{
int x;
int y;
};
1.
void *ptr = malloc(sizeof(int));
struct test *test_ptr = ptr; //OK 7.22.3(p1)
int x = test_ptr -> x; //UB 6.2.6.1(p4)
2.
void *ptr = malloc(sizeof(struct test) + 1);
struct test *test_ptr = ptr + 1; //UB 6.3.2.3(p7)
3.
void *ptr = malloc(sizeof(struct test) + 1);
struct test *test_ptr = ptr; //OK 7.22.3(p1)
int x = test_ptr -> x; //Unspecified behavior or also UB?
My understanding of the cases:
The conversion of the pointer returned by malloc is OK by itself, per 7.22.3(p1):
The pointer returned if the allocation succeeds is suitably aligned so
that it may be assigned to a pointer to any type of object with a
fundamental alignment requirement
The access is incorrect because test_ptr cannot point to a valid struct test object: the memory allocated with malloc is smaller than a struct test, causing UB as explained in 6.2.6.1(p4).
This is UB since we cannot say anything about the alignment of the ptr + 1 pointer. 6.3.2.3(p7) explains this:
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
How is case 3 explained in the Standard?
It is unspecified in the standard (at least I could not find it) whether it is valid to convert a pointer to an object with no declared type to a pointer to an object whose size is less than that of the allocated object. (I'm not considering array allocation here, like malloc(10 * sizeof(struct test));, which is clearly explained in 7.22.3(p1).) 6.2.6.1(p4) states:
Values stored in non-bit-field objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object of that
type, in bytes.
The allocated object does not consist of sizeof(struct test) × CHAR_BIT bits, but of (sizeof(struct test) + 1) × CHAR_BIT bits.
This has to be legal because in C we have flexible array members.
typedef struct flex_s {
int x;
int arr[];
} flex_t;
void *ptr = malloc(sizeof(flex_t) + sizeof(int));
flex_t *flex = ptr;
flex->arr[0]; // legal
So, if you want an answer from the standard, look at its definition of flexible array members and their allocation, and the rule will be given.
You can start by taking a look at example 20 of page 114 of the free draft of C11.

Arrow Operator Usage in Linked List [duplicate]

I am reading a book called "Teach Yourself C in 21 Days" (I have already learned Java and C# so I am moving at a much faster pace). I was reading the chapter on pointers and the -> (arrow) operator came up without explanation. I think that it is used to call members and functions (like the equivalent of the . (dot) operator, but for pointers instead of members). But I am not entirely sure.
Could I please get an explanation and a code sample?
foo->bar is equivalent to (*foo).bar, i.e. it gets the member called bar from the struct that foo points to.
Yes, that's it.
It's just the dot version for when you access the members of a struct/class through a pointer instead of directly.
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = malloc(sizeof(struct foo));
var.x = 5;
(&var)->y = 14.3;
pvar->y = 22.4;
(*pvar).x = 6;
That's it!
I'd just add to the answers the "why?".
. is the standard member access operator, and it has higher precedence than the * dereference operator.
If you write *foo.bar to access a struct's internals, the compiler parses it as *(foo.bar): it looks for a 'bar' member of 'foo' (which is an address in memory), and a mere address obviously has no members.
Thus you need to ask the compiler to dereference first with (*foo) and then access the member: (*foo).bar. That is a bit clumsy to write, so the good folks came up with a shorthand version: foo->bar, which is a sort of member-access-through-pointer operator.
a->b is just short for (*a).b in every way (same for functions: a->b() is short for (*a).b()).
foo->bar is only shorthand for (*foo).bar. That's all there is to it.
Well, I have to add something as well. A structure is a bit different from an array: an array name decays to a pointer to its first element in most expressions, and a structure name does not. So be careful!
Lets say I write this useless piece of code:
#include <stdio.h>
typedef struct{
int km;
int kph;
int kg;
} car;
int main(void){
car audi = {12000, 230, 760};
car *ptr = &audi;
}
Here the pointer ptr holds the address (!) of the structure variable audi, but besides its address the structure also has a chunk of data (!). The first member of that chunk has the same address as the structure itself, but dereferencing the pointer with *ptr (no parentheses) yields the whole structure, not just its first member.
To access a member, you have to add a designator like .km, .kph, or .kg, which are nothing more than offsets from the base address of the chunk of data.
But because of precedence you can't write *ptr.kg: the access operator . is evaluated before the dereference operator *, so you would get *(ptr.kg), which is not possible because a pointer has no members! The compiler knows this and will therefore issue an error, e.g.:
error: ‘ptr’ is a pointer; did you mean to use ‘->’?
printf("%d\n", *ptr.km);
Instead you use (*ptr).kg, which forces the compiler to first dereference the pointer, giving access to the chunk of data, and then add an offset (the designator) to choose the member.
But with nested members this syntax would become unreadable, and therefore -> was introduced. I think readability is the only justifiable reason for using it, as ptr->kg is much easier to write than (*ptr).kg.
Now let us write this differently so that you see the connection more clearly. (*ptr).kg ⟹ (*&audi).kg ⟹ audi.kg. Here I first used the fact that ptr is the "address of audi", i.e. &audi, and then the fact that the "address-of" & and "dereference" * operators cancel each other out.
struct Node {
int i;
int j;
};
struct Node a, *p = &a;
Here, to access the values of i and j, we can use the variable a and the pointer p as follows: a.i, (*p).i and p->i are all the same.
Here . is a "Direct Selector" and -> is an "Indirect Selector".
I had to make a small change to Jack's program to get it to run. After declaring the struct pointer pvar, point it to the address of var. I found this solution on page 242 of Stephen Kochan's Programming in C.
#include <stdio.h>
int main()
{
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = &var;
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
Run this in vim with the following command:
:!gcc -o var var.c && ./var
Will output:
5 - 14.30
6 - 22.40
#include<stdio.h>
int main()
{
struct foo
{
int x;
float y;
} var1;
struct foo var;
struct foo* pvar;
pvar = &var1;
/* If pvar = &var; were used instead, pvar would refer to
the values stored in var, and assigning new values such as
pvar->x = 6; pvar->y = 22.4; would modify var itself,
so it is better to point pvar at a separate object. */
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
The -> operator makes the code more readable than the * operator in some situations.
Such as: (quoted from the EDK II project)
typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
IN EFI_BLOCK_IO_PROTOCOL *This,
IN UINT32 MediaId,
IN EFI_LBA Lba,
IN UINTN BufferSize,
OUT VOID *Buffer
);
struct _EFI_BLOCK_IO_PROTOCOL {
///
/// The revision to which the block IO interface adheres. All future
/// revisions must be backwards compatible. If a future version is not
/// back wards compatible, it is not the same GUID.
///
UINT64 Revision;
///
/// Pointer to the EFI_BLOCK_IO_MEDIA data for this device.
///
EFI_BLOCK_IO_MEDIA *Media;
EFI_BLOCK_RESET Reset;
EFI_BLOCK_READ ReadBlocks;
EFI_BLOCK_WRITE WriteBlocks;
EFI_BLOCK_FLUSH FlushBlocks;
};
The _EFI_BLOCK_IO_PROTOCOL struct contains 4 function pointer members.
Suppose you have a variable struct _EFI_BLOCK_IO_PROTOCOL * pStruct, and you want to use the good old * operator to call one of its function pointer members. You will end up with code like this:
(*pStruct).ReadBlocks(...arguments...)
But with the -> operator, you can write like this:
pStruct->ReadBlocks(...arguments...).
Which looks better?
#include <stdio.h>
struct examp {
int number;
};
struct examp a, *b = &a;
int main(void)
{
a.number = 5;
/* a.number, b->number and (*b).number all produce the same output. b->number is the form mostly used in linked lists. */
printf("%d \n %d \n %d", a.number, b->number, (*b).number);
return 0;
}
The output is 5, printed three times.
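The dot is the member access operator; it selects a particular member of a structure variable.
E.g.:
#include <stdio.h>
#include <string.h>
struct student
{
int s_no;
char name[32];
int age;
} s1, s2;
int main(void)
{
strcpy(s1.name, "Ann"); /* example values */
strcpy(s2.name, "Bob");
printf("%s %s\n", s1.name, s2.name);
return 0;
}
In this way we can use the dot operator to access the members of a structure variable.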

Cast struct to array?

I'm currently learning C and I have trouble understanding the following code:
struct dns_header
{
unsigned char ra : 1;
unsigned char z : 1;
unsigned char ad : 1;
unsigned char cd : 1;
unsigned char rcode : 4;
unsigned short q_count : 16;
};
int main(void)
{
struct dns_header *ptr;
unsigned char buffer[256];
ptr = (struct dns_header *) &buffer;
ptr->ra = 0;
ptr->z = 0;
ptr->ad = 0;
ptr->cd = 0;
ptr->rcode = 0;
ptr->q_count = htons(1);
}
The line I don't understand is ptr = (struct dns_header *) &buffer;
Can anyone explain this in detail?
Your buffer is simply a contiguous array of raw bytes. They have no meaning from the buffer's point of view: you cannot do something like buffer->ra = 1.
However, from a struct dns_header * point of view those bytes become meaningful. What you are doing with ptr = (struct dns_header *) &buffer; is mapping your pointer onto your data.
ptr now points to the beginning of your array of data. This means that when you write a value (ptr->ra = 0), you are actually modifying byte 0 of buffer.
You are viewing your buffer array through a struct dns_header pointer.
The buffer is just serving as an area of memory -- that it's an array of characters is unimportant to this code; it could be an array of any other type, as long as it were the correct size.
The struct defines how you're using that memory -- as a bitfield, it presents that with extreme specificity.
That said, presumably you're sending this structure out over the network -- the code that does the network IO probably expects to be passed a buffer that's in the form of a character array, because that's intrinsically the sanest option -- network IO being done in terms of sending bytes.
Suppose you want to allocate space for the struct so you could
ptr = malloc(sizeof(struct dns_header));
which will return a pointer to the allocated memory,
ptr = (struct dns_header *) &buffer;
is almost the same, except that in this case the memory is allocated on the stack. It's also not necessary to take the address of the array; it can be
ptr = (struct dns_header *) &buffer[0];
or just
ptr = (struct dns_header *) buffer;
there is no problem in that though, because the addresses will be the same.
The line I don't understand is ptr = (struct dns_header *) &buffer;
You take the address of the array and pretend like it is a pointer to a dns_header. It is basically raw memory access, which is unsafe, but OK if you know what you are doing. Doing so will grant you access to write a dns_header in the beginning of the array.
Ideally, it should be an array of dns_headers not a byte array. You have to be cautious about the fact that dns_header contains bit fields, the implementation of which is not enforced by the standard, it is entirely up to the compiler vendors. Although bit field implementations are fairly "sane", there is no guarantee, so the size of a byte array might actually be mismatched with your intent.
Adding to the other answers posted:
This code is illegal since ANSI C. ptr->q_count = htons(1); violates the strict aliasing rule.
It is only permitted to use an unsigned short lvalue (i.e. the expression ptr->q_count) to access memory that either has no declared type (e.g. malloc'd space), or has declared type of short or unsigned short or compatible.
To use this code as-is, you should pass -fno-strict-aliasing to gcc or clang. Other compilers may or may not have a similar flag.
An improved version of the same code (which also has some forwards-compatibility to the structure size changing) is:
struct dns_header d = { 0 };
d.q_count = htons(1);
unsigned char *buffer = (unsigned char *)&d;
This is legal because the strict aliasing rule permits unsigned char to alias anything.
Note that buffer is currently unused in this code. If your code is actually a smaller snippet of larger code then buffer may have to be defined differently. In any case, it could be in a union with d.
A struct occupies a contiguous block of memory, and each field within a struct is located at a fixed offset from the start. The fields can then be accessed via a struct pointer or via the declared struct variable, both of which refer to the same address.
Here we declare a packed struct which occupies a contiguous block of memory:
#pragma pack(push, 1)
struct my_struct
{
unsigned char b0;
unsigned char b1;
unsigned char b2;
unsigned char b3;
unsigned char b4;
};
#pragma pack(pop)
Pointers can then be used to refer to the struct by its address. See this example:
int main(void)
{
struct my_struct *ptr;
unsigned char buffer[5];
ptr = (struct my_struct *) buffer;
ptr->b0 = 'h';
ptr->b1 = 'e';
ptr->b2 = 'l';
ptr->b3 = 'l';
ptr->b4 = 'o';
for (int i = 0; i < 5; i++)
{
putchar(buffer[i]); // Print "hello"
}
return 0;
}
Here we explicitly map the struct 1:1 onto the contiguous block of memory pointed to by buffer (using the address of its first element).
The address of an array and the array name (which decays to a pointer to its first element) are numerically identical but have different types. These two lines thus have the same effect:
ptr = (struct my_struct *) buffer;
ptr = (struct my_struct *) &buffer;
This is usually not a problem if we use the address as is and cast it appropriately. Dereferencing an array address of type pointer-to-array yields the array itself, which decays to a pointer with the same numeric value but a different type.
Although it might seem convenient to manipulate memory in this fashion, it is strongly discouraged as the resulting code becomes painfully difficult to understand. If you really have no choice, I suggest using a union to make explicit that the struct is to be used in this manner.
