Is it conforming with the standard to pack two objects using the align of the second object to get the final size?
I'm using this approach for a doubly linked list, but extracting the relevant part:
#include <stdio.h>
#include <stdlib.h>
#include <stdalign.h>
struct node
{
struct node *prev;
struct node *next;
};
#define get_node(data, szof) ((struct node *)(((char *)data) + szof))
int main(void)
{
size_t align = alignof(struct node);
double *data;
// Round size up to nearest multiple of alignof(struct node)
size_t szof = (sizeof(*data) + (align - 1)) / align * align;
// Pack `data` + `struct node`
data = malloc(szof + sizeof(struct node));
// Get node using a generic pointer to calculate the offset
struct node *node = get_node(data, szof);
*data = 3.14;
node->prev = NULL;
node->next = NULL;
printf("%f\n", *data);
free(data);
return 0;
}
Where data can be a pointer to any primitive or composite type.
Is it conforming with the standard to pack two objects using the align of the second object to get the final size?
Sure, the code presented is valid.
There's not much to write here, as it's harder to prove something than to disprove it. The pointer values are properly aligned for the referenced types, and there are no uninitialized memory accesses. If you take care of alignment yourself, you can write whole programs without ever using struct.
In real code, I advise making a structure and letting the compiler figure it out[1]. We have offsetof.
struct double_and_node {
double data;
struct node node;
};
void *pnt = malloc(sizeof(struct double_and_node));
double *data = (double*)((char*)pnt + offsetof(struct double_and_node, data));
struct node *node = (struct node*)((char*)pnt + offsetof(struct double_and_node, node));
I guess you could research container_of and see C11 6.3.2.3p7.
[1] but really, if so, just use the structure anyway...:
struct double_and_node *pnt = malloc(sizeof(struct double_and_node));
double *data = &pnt->data;
struct node *node = &pnt->node;
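For reference, the size computation from the question can be wrapped in two small helpers. This is only a sketch: round_up and alloc_with_node are hypothetical names that do not appear in the original code, and the caller passes in the size of the data.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

struct node
{
    struct node *prev;
    struct node *next;
};

/* Round n up to the nearest multiple of align (align must be non-zero). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + (align - 1)) / align * align;
}

/* Hypothetical helper: allocate data_size bytes followed by a suitably
   aligned struct node. Returns the data pointer (or NULL on failure) and
   stores the node pointer in *out_node. */
void *alloc_with_node(size_t data_size, struct node **out_node)
{
    size_t offset = round_up(data_size, alignof(struct node));
    char *block = malloc(offset + sizeof(struct node));
    if (block == NULL)
        return NULL;

    /* The offset is a multiple of alignof(struct node), and malloc returns
       memory aligned for any object type, so the node is aligned. */
    *out_node = (struct node *)(block + offset);
    (*out_node)->prev = NULL;
    (*out_node)->next = NULL;
    return block;
}
```

The returned data pointer is the start of the allocation, so it is also the pointer to pass to free.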
Well, this is complicated. The ((char *)data) + szof line is arguably invoking undefined behavior depending on alignof(struct node) vs sizeof(double), but it isn't very obvious.
First of all, let's assume that double* data is actually pointing at a double. We would then be allowed to inspect this object through a character type pointer, as per 6.3.2.3/7:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
So we may do ((char *)data) + szof while we stick inside the actual double. Otherwise, if we go out of bounds of that double, the above quoted special rule doesn't apply.
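As a small illustration of the quoted rule, the bytes of a double can be read one by one through an unsigned char pointer, as long as the pointer stays within sizeof(double) bytes. The function name double_roundtrip is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of d byte by byte
   through a character pointer, then rebuild a double from those bytes. */
double double_roundtrip(double d)
{
    const unsigned char *bytes = (const unsigned char *)&d;
    unsigned char copy[sizeof d];

    /* The index stays in [0, sizeof d), i.e. inside the double object. */
    for (size_t i = 0; i < sizeof d; i++)
        copy[i] = bytes[i];

    /* Sanity check: the copy matches the object representation. */
    assert(memcmp(copy, &d, sizeof d) == 0);

    double result;
    memcpy(&result, copy, sizeof result);
    return result;
}
```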
Rather, we are supposedly left to the rules of pointer arithmetic, specified by the additive operators. Although these rules expect you to be using the pointed-at type double* and not a char*. Those rules don't really specify what happens when you inspect a double through a char* and go beyond sizeof(double) bytes.
So the ((char *)data) + szof going beyond sizeof(double) is questionable - I think it is undefined behavior no matter how you put it.
Then there's another aspect here... what if the char pointer is pointing at something without a type? The C standard doesn't specify what will happen then. And this is actually what the code is doing.
Because as it happens, data = malloc(szof + sizeof(struct node)); allocates a raw segment with neither a declared type nor an "effective type". The rules in 6.5 then state that
If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent accesses that do not modify
the stored value
And you don't lvalue access the actual memory until *data = 3.14;, in which case the memory gets the effective type double. This happens after the pointer arithmetic.
As an extended idea of previous answers, something generic could be done with macros using typeof() and offsetof() defining/using a structure defined on the fly to concatenate a data type with the node structure:
#include <stdio.h>
#include <stddef.h>
struct node
{
struct node *prev;
struct node *next;
};
#define LINKED_TYPE_SIZE(data) \
sizeof(struct { typeof(data) f; struct node node; })
#define LINKED_TYPE_NODE(datap) \
(struct node *)((char *)(datap) + offsetof(struct { typeof(*(datap)) f; struct node node; }, node))
int main(void)
{
double v1;
printf("size of linked double = %zu\n", LINKED_TYPE_SIZE(v1));
printf("%p, %p\n", (void *)&v1, (void *)LINKED_TYPE_NODE(&v1));
int v2;
printf("size of linked int = %zu\n", LINKED_TYPE_SIZE(v2));
printf("%p, %p\n", (void *)&v2, (void *)LINKED_TYPE_NODE(&v2));
short int v3;
printf("size of linked short int = %zu\n", LINKED_TYPE_SIZE(v3));
printf("%p, %p\n", (void *)&v3, (void *)LINKED_TYPE_NODE(&v3));
struct foo {
int f1;
char f2;
int f3;
} foo_struct;
printf("size of linked foo = %zu\n", LINKED_TYPE_SIZE(foo_struct));
printf("%p, %p\n", (void *)&foo_struct, (void *)LINKED_TYPE_NODE(&foo_struct));
return 0;
}
Running the preceding program gives the following on an x86_64 Linux desktop:
$ gcc try.c -o try
$ ./try
size of linked double = 24
0x7ffdfbdf50f8, 0x7ffdfbdf5100
size of linked int = 24
0x7ffdfbdf50f4, 0x7ffdfbdf50fc
size of linked short int = 24
0x7ffdfbdf50f2, 0x7ffdfbdf50fa
size of linked foo = 32
0x7ffdfbdf5100, 0x7ffdfbdf5110
N.B.:
As typeof is a non-standard extension (it was only standardized in C23), it is also possible to get rid of it by explicitly passing the type of the data as a parameter to the macros:
#define LINKED_TYPE_SIZE(type) \
sizeof(struct { type f; struct node node; })
#define LINKED_TYPE_NODE(type, datap) \
(struct node *)((char *)(datap) + offsetof(struct { type f; struct node node; }, node))
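Another option, sketched here under the assumption that one wrapper type per data type is acceptable, is to name the wrapper struct once with a macro. DEFINE_LINKED, linked_double and the two helper functions are hypothetical names. The point is that sizeof and member access then refer to one named type instead of to several structurally identical anonymous ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node
{
    struct node *prev;
    struct node *next;
};

/* Hypothetical macro: declare a named wrapper with the data first and
   the list node after it. */
#define DEFINE_LINKED(type, name) \
    struct name { type f; struct node node; }

DEFINE_LINKED(double, linked_double);

/* Allocate one wrapper and return a pointer to its data member. */
double *linked_double_new(double value)
{
    struct linked_double *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->f = value;
    w->node.prev = NULL;
    w->node.next = NULL;
    return &w->f;
}

/* Step from the data member to the node. Because f is the first member,
   a pointer to it can be converted back to a pointer to the wrapper. */
struct node *linked_double_node(double *datap)
{
    assert(datap != NULL);
    struct linked_double *w = (struct linked_double *)datap;
    return &w->node;
}
```

Since the data is the first member, the data pointer has the same address as the allocation and can be passed to free.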
I define relative pointer to mean what Ginger Bill describes as Self-Relative Pointers:
... define the base [to which an offset will be applied] to be the memory address of the offset itself
For example, consider this struct:
struct house {
int32_t weight;
};
struct person {
int32_t age;
struct house* residence;
};
int32_t getPersonsHousesWeight(struct person* p) {
return p->residence->weight;
}
The relative-pointer implementation of the same thing in C that I think might work is:
struct house { ... }; // same as before
struct person {
int32_t age;
int64_t residence; // an offset from the person's address in memory
};
int32_t getPersonsHousesWeight(struct person* p) {
return ((struct house*)((char*)p + (p->residence)))->weight;
}
Assuming that alignment of everything is good (all 8 bytes), is this free of undefined behavior?
EDIT
@tstanisl has provided an excellent answer (which I've accepted) that thoroughly explains UB in the context of stack allocations. I am curious how allocation into a large slab of contiguous heap would impact this analysis. For example:
int foo(void) {
char* base = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// Omitting mmap error checking
struct person* myPerson = (struct person*)(base + 128);
struct house* myHouse = (struct house*)(base + 256);
int32_t delta = (char*)myHouse - (char*)myPerson;
// Does the computation of delta invoke UB?
}
Usually it is going to be UB.
The first case is when person and house belong to separate objects.
In that case it is UB because the pointer arithmetic is performed outside of the object.
int foo(void) {
struct person p;
struct house h;
p.residence = (char*)&h - (char*)&p; // already UB
getPersonsHousesWeight(&p); // UB again
}
In practice it means that the compiler is not obligated to notice that objects accessed through pointers constructed from &p can alias object h, because p and h are separate memory regions (aka objects).
When both objects are placed inside a larger object, the situation is a bit better, though it is still technically UB.
int foo(void) {
struct ph {
struct person p;
struct house h;
} ph;
ph.p.residence = (char*)&ph.h - (char*)&ph.p; // still UB
getPersonsHousesWeight(&ph.p); // UB again
}
It is UB because the pointer arithmetic is done outside the member object.
(char*)&ph.h - 1 is a pointer outside of ph.h.
Note, that this code will likely work pretty much everywhere.
Otherwise, heavily used container_of-like macros would not work breaking a lot of existing code including the Linux kernel.
To avoid UB the pointer must be constructed in a special way to avoid moving outside of the originating object.
Rather than using &ph.h, one should use (char*)&ph + offsetof(struct ph, h).
Similarly &ph.p should be replaced with (char*)&ph + offsetof(struct ph, p).
Now this code should be portable:
int foo(void) {
struct ph {
struct person p;
struct house h;
} ph;
struct person *p_ptr = (struct person*)((char*)&ph + offsetof(struct ph, p));
struct house *h_ptr = (struct house*) ((char*)&ph + offsetof(struct ph, h));
ph.p.residence = (char*)h_ptr - (char*)p_ptr;
getPersonsHousesWeight(p_ptr);
}
Though it is very obscure.
The interesting discussion on this topic can be found at link
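To make the last variant concrete, here is a sketch that places both objects in one malloc'd block and derives every pointer from the block's base address, following the reasoning above. struct pair, make_pair and persons_house_weight are names invented for this example, and residence stores the byte offset from the person to the house, as in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct house { int32_t weight; };

struct person {
    int32_t age;
    int64_t residence; /* byte offset from this person to its house */
};

/* Enclosing object: both pointers below are derived from its base. */
struct pair {
    struct person p;
    struct house h;
};

int32_t persons_house_weight(const struct person *p)
{
    const struct house *h =
        (const struct house *)((const char *)p + p->residence);
    return h->weight;
}

/* Hypothetical helper: allocate a pair and link the person to the house
   by offset. The person is the first member, so the returned pointer is
   also the pointer to pass to free. */
struct person *make_pair(int32_t age, int32_t weight)
{
    char *base = malloc(sizeof(struct pair));
    if (base == NULL)
        return NULL;

    struct person *p = (struct person *)(base + offsetof(struct pair, p));
    struct house *h = (struct house *)(base + offsetof(struct pair, h));

    p->age = age;
    h->weight = weight;
    p->residence = (const char *)h - (const char *)p;
    assert(p->residence ==
           (int64_t)(offsetof(struct pair, h) - offsetof(struct pair, p)));
    return p;
}
```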
I originally asked this question: Type Punning with Unions and Heap
And not wanting the question to keep evolving to the point that anyone reading in the future had no idea what the original question was, I have a spin off question.
After reading this site:
https://kristerw.blogspot.com/2016/05/type-based-aliasing-in-c.html
Near the bottom it talks about malloc'd memory. Is it safe to say that casting from one pointer type to another pointer type is safe when memory is on the heap?
Example:
#include <stdio.h>
#include <stdlib.h>
struct test1
{
int a;
char b;
};
struct test2
{
int c;
char d;
};
void printer(const struct test2* value);
int main()
{
struct test1* aQuickTest = malloc(sizeof(struct test1));
aQuickTest->a = 42;
aQuickTest->b = 'a';
printer((struct test2*)aQuickTest); //safe because memory was malloc'd???
return 0;
}
void printer(const struct test2* value)
{
printf("Int: %i Char: %c",value->c, value->d);
}
And guessing it might not be safe. What would be the proper way to do this with memcpy? I will attempt to write an example with a function of what might hopefully work?
struct test2* converter(struct test1* original);
int main()
{
struct test1* aQuickTest = malloc(sizeof(struct test1));
aQuickTest->a = 42;
aQuickTest->b = 'a';
struct test2* newStruct = converter(aQuickTest);
printer(newStruct);
return 0;
}
struct test2* converter(struct test1* original)
{
struct test2* temp;
memcpy(&temp, &original, sizeof(struct test2));
return temp;
}
void *pnt = malloc(sizeof(struct test1));
What type does the memory behind the pnt pointer have? No type. It is uninitialized (its value is "indeterminate"). There is just "memory".
Then you do:
struct test1* aQuickTest = malloc(sizeof(struct test1));
You only convert the pointer. Nothing else happens here; no code is generated for the conversion. Reading uninitialized memory is undefined behavior though, so you can't read from aQuickTest->a (yet). But you can assign:
aQuickTest->a = 1;
This writes to an object struct test1 in the memory. This is assignment. You can now read aQuickTest->a, ie. print it.
But the following
printf("%d", ((struct test2*)aQuickTest)->c);
is undefined behavior (although it will/should work in practice). You access the underlying object (i.e. a struct test1) through a pointer of a non-matching type, struct test2*. This is called a "strict aliasing violation". Accessing an object (i.e. using -> or *) through an lvalue of an incompatible type results in undefined behavior. It does not matter that struct test1 and struct test2 "look the same"; they are different types. The rule is in the C11 standard, 6.5p7.
In the first code snippet, undefined behavior happens inside printf("Int: %i Char: %c",value->c, value->d). The access value->c reads the underlying memory through an incompatible type.
In the second code snippet, temp is only a pointer, and so is original. Doing memcpy(&temp, &original, sizeof(struct test2)); is invalid, because &temp makes memcpy write into the temp pointer itself and &original makes it read from the original pointer itself, not the memory behind the pointers. If sizeof(temp) < sizeof(struct test2) or sizeof(original) < sizeof(struct test2), this writes out of bounds of temp and reads out of bounds of original, which is undefined behavior.
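Anyway, even if it were:
struct test1* original = &(some valid struct test1 object);
struct test2 temp;
memcpy(&temp, original, sizeof(struct test2));
printf("%d", temp.c);
this case is different: temp has the declared type struct test2, so reading temp.c accesses a struct test2 object through its own type. What memcpy does not do is change the type of allocated memory: when the destination has no declared type (e.g. it came from malloc), memcpy carries the effective type of the source over to it (C11 6.5p6). Copying a struct test1 into a malloc'd block therefore leaves a struct test1 there, and reading it through a struct test2* is still invalid.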
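One way to sidestep the aliasing question entirely is to build a real struct test2 object and copy the values member by member. This is a sketch; convert is a hypothetical replacement for the question's converter, and it returns the struct by value rather than a pointer.

```c
#include <assert.h>
#include <stddef.h>

struct test1 { int a; char b; };
struct test2 { int c; char d; };

/* Hypothetical helper: build a genuine struct test2 from a struct test1.
   Every access uses an lvalue of the object's own type. */
struct test2 convert(const struct test1 *original)
{
    assert(original != NULL);
    struct test2 result;
    result.c = original->a;
    result.d = original->b;
    return result;
}
```

The result can then be passed to the question's printer as printer(&converted), with no pointer cast involved.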
Is it possible to find the size of item_t through the pointer?
typedef struct item
{
char x;
char y;
char life;
}item_t;
void main (void)
{
item_t test;
void *ptr = &test;
printf("%d\n",sizeof(ptr));
}
return: 8
Not if ptr is of type void* -- but it probably shouldn't be.
You can't dereference a void* pointer. You can convert it to some other pointer type and dereference the result of the conversion. That can sometimes be useful, but more often you should just define the pointer with the correct type in the first place.
If you want a pointer to an item_t object, use an item_t* pointer:
item_t test;
item_t *ptr = &test;
printf("%zu\n", sizeof(*ptr));
This will give you the size of a single item_t object, because that's the type that ptr points to. If ptr is uninitialized, or is a null pointer, you'll get the same result, because the operand of sizeof is not evaluated (with one exception that doesn't apply here). If ptr was initialized to point to the initial element of an array of item_t objects:
ptr = malloc(42 * sizeof *ptr);
sizeof *ptr will still only give you the size of one of them.
The sizeof operator is (usually) evaluated at compile time. It uses only information that's available to the compiler. No run-time calculation is performed. (The exception is an operand whose type is a variable-length array.)
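A quick way to see that the operand is not evaluated: applying sizeof to *ptr is fine even when ptr is a null pointer. The function name item_size_from_null is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct item
{
    char x;
    char y;
    char life;
} item_t;

/* Hypothetical helper: sizeof only looks at the type of *ptr, so ptr is
   never dereferenced even though it is a null pointer. */
size_t item_size_from_null(void)
{
    item_t *ptr = NULL;
    size_t n = sizeof *ptr;
    assert(n == sizeof(item_t));
    return n;
}
```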
The correct format for printing a value of type size_t (such as the result of sizeof) is %zu, not %d.
And void main(void) should be int main(void) (unless you have a very good reason to use a non-standard definition -- which you almost certainly don't). If a book told you to define main with a return type of void, get a better book; its author doesn't know C very well.
Short answer: no. Given only ptr, all you have is an address (answer by WhozCraig).
Longer answer: you can implement inheritance by having the first field in all your structs specify its size. For example:
struct something_that_has_size
{
size_t size;
};
struct item
{
size_t size;
char x;
char y;
char life;
};
struct item2
{
size_t size;
char x;
char y;
char z;
char life;
};
// Somewhere in your code
...
struct item *i1 = malloc(sizeof(struct item));
i1->size = sizeof(struct item); // you are telling yourself what the size is
struct item2 *i2 = malloc(sizeof(struct item2));
i2->size = sizeof(struct item2);
// Later in your code
void *ptr = ... // get a pointer somehow
size_t size = ((struct something_that_has_size*)ptr)->size; // here is your size
But instead of the size, you would do better to record the type of your struct; it's more useful than just the size. This technique is called a discriminated union.
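A sketch of recording the type rather than the size: every struct starts with the same header member holding a tag, and the size is looked up from the tag. The names enum kind, struct header and size_of_tagged are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum kind { KIND_ITEM, KIND_ITEM2 };

/* Common first member: a pointer to any of the structs below can be
   converted to a pointer to its first member. */
struct header { enum kind kind; };

struct item  { struct header hdr; char x; char y; char life; };
struct item2 { struct header hdr; char x; char y; char z; char life; };

/* Hypothetical helper: recover the size from the tag in the header. */
size_t size_of_tagged(const void *ptr)
{
    assert(ptr != NULL);
    const struct header *hdr = ptr;
    switch (hdr->kind) {
    case KIND_ITEM:  return sizeof(struct item);
    case KIND_ITEM2: return sizeof(struct item2);
    }
    return 0; /* unknown tag */
}
```

This still relies on whoever creates the object setting the tag correctly; a void* on its own carries no type information.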
You can only cast a void pointer to get what is behind in the correct type, you cannot dereference it directly.
#include <stdio.h>
#include <stdlib.h>
typedef struct item {
char x;
char y;
char life;
} item_t;
int main()
{
item_t test;
void *ptr = &test;
printf("%zu %zu\n", sizeof(*(item_t *) ptr), sizeof(item_t));
exit(EXIT_SUCCESS);
}
But that is not of much use, because you need to know the type in the first place and have gained nothing.
TL;DR: no, not possible
I've recently found this page:
Making PyObject_HEAD conform to standard C
and I'm curious about this paragraph:
Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int , the struct * may also be cast to an int * , allowing to write int values into the first field.
So I wrote this code to check with my compilers:
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = malloc(sizeof(struct with_int));
i->a = 5;
((int *)&i)->a = 8;
}
but I'm getting error: request for member 'a' in something not a struct or union.
Did I get the above paragraph right? If no, what am I doing wrong?
Also, if someone knows where C standard is referring to this rule, please point it out here. Thanks.
Your interpretation [1] is correct, but the code isn't.
The pointer i already points to the object, and thus to the first element, so you only need to cast it to the correct type:
int* n = ( int* )i;
then you simply dereference it:
*n = 345;
Or in one step:
*( int* )i = 345;
[1] Quoted from ISO/IEC 9899:201x, 6.7.2.1 Structure and union specifiers, paragraph 15:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
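The quoted rule works in both directions, which the following sketch exercises. set_first_member and owner_of are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

struct with_int {
    int a;
    char b;
};

/* Write the first member through an int pointer obtained by converting
   the struct pointer. */
void set_first_member(struct with_int *s, int value)
{
    assert(s != NULL);
    int *first = (int *)s;
    *first = value;
}

/* Convert a pointer to the first member back to the enclosing struct.
   Only valid if first really points at the a member of a struct with_int. */
struct with_int *owner_of(int *first)
{
    assert(first != NULL);
    return (struct with_int *)first;
}
```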
You have a few issues, but this works for me:
#include <stdlib.h>
#include <stdio.h>
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = (struct with_int *)malloc(sizeof(struct with_int));
i->a = 5;
*(int *)i = 8;
printf("%d\n", i->a);
}
Output is:
8
Like other answers have pointed out, I think you meant:
// Interpret (struct with_int *) as (int *), then
// dereference it to assign the value 8.
*((int *) i) = 8;
and not:
((int *) &i)->a = 8;
However, none of the answers explain specifically why that error makes sense.
Let me explain what ((int *) &i)->a means:
i is a variable that holds an address to a (struct with_int). &i is the address on main() function's stack space. This means &i is an address, that contains an address to a (struct with_int). In other words, &i is a pointer to a pointer to (struct with_int). Then the cast (int *) of this would tell the compiler to interpret this stack address as an int pointer, that is, address of an int. Finally, with that ->a, you are asking the compiler to fetch the struct member a from this int pointer and then assign the value 8 to it. It doesn't make sense to fetch a struct member from an int pointer. Hence, you get error: request for member 'a' in something not a struct or union.
Hope this helps.
Hey I am getting this error:
error: conversion to non-scalar type requested
Here are my structs:
typedef struct value_t value;
struct value{
void* x;
int y;
value* next;
value* prev;
};
typedef struct key_t key;
struct key{
int x;
value * values;
key* next;
key* prev;
};
Here is the code that is giving me problems:
struct key new_node = (struct key) calloc(1, sizeof(struct key));
struct key* curr_node = head;
new_node.key = new_key;
struct value head_value = (struct value) calloc(1, sizeof(struct value))
Am I not supposed to use calloc on structs? Also, I have a struct that I have created, and I want to set a pointer of the same struct type to it, but I am getting an error. This is an example of what I am doing:
struct value x;
struct value* y = *x;
this gives me this error
error: invalid type argument of ‘unary *’
When I do y = x, I get this warning:
warning: assignment from incompatible pointer type
You are trying to assign a pointer expression (the return type of malloc() and friends is void*) to a struct type (struct key). That is nonsense. Also, the cast is not needed (and possibly dangerous, since it can hide errors):
struct key *new_node = calloc(1, sizeof *new_node);
the same problem with the other malloc() line:
struct value *head_value = calloc(1, sizeof *head_value);
More errors: You are omitting the 'struct' keyword (which is allowed in C++, but nonsense in C):
struct key{
int x;
struct value *values;
struct key *next;
struct key *prev;
};
UPDATE: using structs and pointers to struct.
struct key the_struct;
struct key other_struct;
struct key *the_pointer;
the_pointer = &other_struct; // a pointer should point to something
the_struct.x = 42;
the_pointer->x = the_struct.x;
/* a->b can be seen as shorthand for (*a).b :: */
(*the_pointer).x = the_struct.x;
/* and for the pointer members :: */
the_struct.next = the_pointer;
the_pointer->next = malloc (sizeof *the_pointer->next);
I don't think you've correctly understood typedefs.
The common idiom with using typedefs for convenience naming is this:
struct foo {
int something;
};
typedef struct foo foo_t;
Then you use the type foo_t instead of the less convenient struct foo.
For convenience, you can combine the struct declaration and the typedef into one block:
typedef struct {
int something;
} foo_t;
This defines a foo_t just like the above.
The last token on the typedef line is the name you're assigning. I have no idea what the code you wrote is actually doing to your namespace, but I doubt it's what you want.
Now, as for the code itself: calloc returns a pointer, which means both your cast and your storage type should be struct key* (or, if you fix your naming, key_t*). The correct line is struct key* new_node = (struct key*)calloc(1, sizeof(struct key));
For your second, independent, issue, the last line should be struct value* y = &x;. You want y to store the address of x, not the thing at address x. The error message indicates this - you are misusing the unary star operator to attempt to dereference a non-pointer variable.
struct key new_node = (struct key) calloc(1, sizeof(struct key));
calloc returns a pointer value (void *), which you are trying to convert and assign to an aggregate (IOW, non-scalar) type (struct key). To fix this, change the type of new_node to struct key * and rewrite your allocation as follows:
struct key *new_node = calloc(1, sizeof *new_node);
Two things to note. First of all, ditch the cast expression. malloc, calloc, and realloc all return void *, which can be assigned to any object pointer type without need for a cast [1]. In fact, the presence of a cast can potentially mask an error if you forget to include stdlib.h or otherwise don't have a declaration for malloc in scope [2].
Secondly, note that I use the expression *new_node as the argument to sizeof, rather than (struct key). sizeof doesn't evaluate its operand (unless it has a variable-length array type, which this doesn't); it just computes the type of the expression. Since the type of the expression *new_node is struct key, sizeof will return the correct number of bytes to store that object. It can save some maintenance headaches if your code is structured like
T *foo;
... // more than a few lines of code
foo = malloc(sizeof (T));
and you change the type of foo in the declaration, but forget to update the malloc call.
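Put together, a small constructor using the sizeof *pointer idiom might look like the sketch below. key_new and key_link are hypothetical helpers, and struct value is only forward-declared because it is used through a pointer.

```c
#include <assert.h>
#include <stdlib.h>

struct value; /* only used through a pointer here */

struct key {
    int x;
    struct value *values;
    struct key *next;
    struct key *prev;
};

/* Hypothetical helper: allocate one key. The size follows the pointer's
   type, so it stays correct if the declaration of new_node changes. */
struct key *key_new(int x)
{
    struct key *new_node = calloc(1, sizeof *new_node);
    if (new_node == NULL)
        return NULL;
    new_node->x = x;
    /* Set the pointers explicitly rather than relying on zeroed bytes. */
    new_node->values = NULL;
    new_node->next = NULL;
    new_node->prev = NULL;
    return new_node;
}

/* Hypothetical helper: link b directly after a. */
void key_link(struct key *a, struct key *b)
{
    assert(a != NULL && b != NULL);
    a->next = b;
    b->prev = a;
}
```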
Also, it's not clear what you're trying to accomplish with your typedefs and struct definitions. The code
typedef struct value_t value;
struct value{
void* x;
int y;
value* next;
value* prev;
};
isn't doing what you think it is. You're creating a typedef name value which is a synonym for an as-yet-undefined type struct value_t. This value type is different from the struct value type you create later (typedef names and struct tags live in different namespaces). Rewrite your structs to follow this model:
struct value_t {
void *x;
int y;
struct value_t *next;
struct value_t *prev;
};
typedef struct value_t value;
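If the short name is wanted inside the struct body as well, the typedef can come first, provided it names the same tag that is defined afterwards. A sketch, with value_link as a made-up helper:

```c
#include <assert.h>
#include <stddef.h>

/* The typedef refers to struct value_t, which is completed just below. */
typedef struct value_t value;

struct value_t {
    void *x;
    int y;
    value *next;
    value *prev;
};

/* Hypothetical helper: link b after a in a doubly linked list. */
void value_link(value *a, value *b)
{
    assert(a != NULL && b != NULL);
    a->next = b;
    b->prev = a;
}
```

The difference from the question's code is only the tag: the typedef and the struct definition both say value_t, so value and struct value_t are the same type.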
Also, life will be easier if you write your declarations so that the * is associated with the declarator, not the type specifier [3]. A declaration like T* p is parsed as though it were written T (*p). This will save you the embarrassment of writing int* a, b; and expecting both a and b to be pointers (b is just a regular int).
[1] This is one area where C and C++ differ; C++ does not allow implicit conversions between void * and other object pointer types, so if you compile this as C++ code, you'll get an error at compile time. Also, before the 1989 standard was adopted, the *alloc functions returned char *, so in those days a cast was required if you were assigning to a different pointer type. This should only be an issue if you're working on a very old system.
[2] Up until the 1999 standard, if the compiler saw a function call without a preceding declaration, it assumed the function returned an int (which is why you still occasionally see examples like
main()
{
...
}
in some tutorials; main is implicitly typed to return int. As of C99, this is no longer allowed). So if you forget to include stdlib.h and call calloc (and you're not compiling as C99), the compiler will assume the function returns an int and generate the machine code accordingly. If you leave the cast off, the compiler will issue a diagnostic to the effect that you're trying to assign an int value to a pointer, which is not allowed. If you leave the cast in, the code will compile but the pointer value may be munged at runtime (converting a pointer to int and back to a pointer again is not guaranteed to be meaningful).
[3] There are some rare instances, limited to C++, where the T* p style can make code a little more clear, but in general you're better off following the T *p style. Yes, that's a personal opinion, but one that's backed up by a non-trivial amount of experience.
calloc(3) returns a pointer to the memory it allocates.
struct key new_node = (struct key) calloc(1, sizeof(struct key));
should be
struct key* new_node = calloc(1, sizeof(struct key));
You should not assign a pointer to a non-pointer variable. Change new_node to be a pointer.
Also, to use the address of variable, you need &, not *, so change it to struct value* y = &x;
Edit: your typedefs are wrong too. Reverse them.
For the second problem, you want to use an ampersand (&) instead of an asterisk (*). An asterisk dereferences a pointer; an ampersand gives you a pointer to the value.