Use a base struct to extract values from void* - c

I have many types of structs in my project, and another struct that holds a pointer to one of these structs. Such as,
struct one{int num; /* holds 1 */};
struct two{int num; /* holds 2 */};
struct three{int num; /* holds 3 */};
// These structs hold many other values as well, but the first value is always `int num`.
And I have another struct that holds references to these structs. I had to use void* because I do not know which of these structs is going to be referenced.
struct Holder{void* any_struct;};
My question: I need the values inside these structs, but I only have a void pointer. Could I declare a base struct whose first member is an int, cast to it, and use it to extract the num variable from these structs? For example:
struct Base{int num;};
((struct Base*) Holder->any_struct)->num
// Gives 1, 2 or 3

If the only thing you need is to extract the num, you can use memcpy.
Assuming it's always int and always first and always present.
int num = 0;
memcpy(&num, Holder->any_struct, sizeof(int));
// Gives 1, 2 or 3 in num.
C99 standard, section 6.7.2.1, paragraph 13:
A pointer to a structure object, suitably converted, points to its
initial member. There may be unnamed padding within a structure
object, but not at its beginning.
More info about the standard in this answer.
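As a minimal sketch of the memcpy approach: the struct contents beyond `num` and the helper name `holder_num` are made up for illustration, and the code assumes every struct the holder can point at really does begin with an `int`.

```c
#include <string.h>

/* Hypothetical structs: the members after `num` are invented for this sketch. */
struct one   { int num; double payload; };
struct two   { int num; char name[16]; };
struct three { int num; void *next; };

struct Holder { void *any_struct; };

/* Copy the leading int out of whatever object the holder points at.
   Assumes the pointed-to struct starts with an int member. */
int holder_num(const struct Holder *h)
{
    int num = 0;
    memcpy(&num, h->any_struct, sizeof num);
    return num;
}
```

Because memcpy copies the bytes of the object representation, this reads the initial member without ever forming an lvalue of a different struct type.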

I think this is acceptable, and I've seen this pattern in other C projects. E.g., in libuv. They define a type uv_handle_t and refer to it as a "Base handle" ... here's info from their page (http://docs.libuv.org/en/v1.x/handle.html)
uv_handle_t is the base type for all libuv handle types.
Structures are aligned so that any libuv handle can be cast to
uv_handle_t. All API functions defined here work with any handle type.
How they implement it is a pattern you could adopt. They define a macro for the common fields:
#define UV_HANDLE_FIELDS \
/* public */ \
void* data; \
/* read-only */ \
uv_loop_t* loop; \
uv_handle_type type; \
/* private */ \
uv_close_cb close_cb; \
void* handle_queue[2]; \
union { \
int fd; \
void* reserved[4]; \
} u; \
UV_HANDLE_PRIVATE_FIELDS \
/* The abstract base class of all handles. */
struct uv_handle_s {
UV_HANDLE_FIELDS
};
... and then they use this macro to define "derived" types:
/*
* uv_stream_t is a subclass of uv_handle_t.
*
* uv_stream is an abstract class.
*
* uv_stream_t is the parent class of uv_tcp_t, uv_pipe_t and uv_tty_t.
*/
struct uv_stream_s {
UV_HANDLE_FIELDS
UV_STREAM_FIELDS
};
The advantage of this approach is that you can add fields to the "base" class by updating this macro, and then be sure that all "derived" classes get the new fields.
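A small sketch of the same macro pattern outside libuv. All names here (`SHAPE_FIELDS`, `struct shape`, `common_fields_line_up`) are hypothetical, and the helper only checks the layout assumption that casting between the types would rely on; it is not something libuv itself provides.

```c
#include <stddef.h>

/* Common leading fields, shared by every "derived" struct. */
#define SHAPE_FIELDS   \
    int kind;          \
    const char *label;

struct shape  { SHAPE_FIELDS };
struct circle { SHAPE_FIELDS double radius; };
struct rect   { SHAPE_FIELDS double width; double height; };

/* Returns 1 when the common fields sit at the same offsets in every struct. */
int common_fields_line_up(void)
{
    return offsetof(struct circle, kind)  == offsetof(struct shape, kind)
        && offsetof(struct circle, label) == offsetof(struct shape, label)
        && offsetof(struct rect, kind)    == offsetof(struct shape, kind)
        && offsetof(struct rect, label)   == offsetof(struct shape, label);
}
```

Accessing `kind` or `label` through the struct's own type is always fine; reading them through a pointer to a different struct type depends on the aliasing questions discussed in the other answers.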

First of all, the various rules of type conversions between different struct types in C are complex and not something one should meddle with unless one knows the rules of what makes two structs compatible, the strict aliasing rule, alignment issues and so on.
That being said, the simplest kind of base class interface is similar to what you have:
typedef struct
{
int num;
} base_t;
typedef struct
{
base_t base;
/* struct-specific stuff here */
} one_t;
one_t one = ...;
...
base_t* ref = (base_t*)&one;
ref->num = 0; // this is well-defined
In this code, the base_t* doesn't point directly at num but at the first object in the struct which is of base_t. It is fine to de-reference it because of that.
However, your original code with the int num spread over 3 structs doesn't necessarily allow you to cast from one struct type to another, even if you only access the initial member num. There are various details regarding strict aliasing and compatible types that may cause problems.
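Applied to the holder from the question, the embedded-base approach might look like the sketch below. The second struct `two_t`, its members, and the helper `read_num` are assumptions added for illustration.

```c
typedef struct { int num; } base_t;

/* Each concrete struct embeds base_t as its first member. */
typedef struct { base_t base; double value; } one_t;
typedef struct { base_t base; const char *text; } two_t;

struct Holder { void *any_struct; };

/* Assumes any_struct points at an object whose first member is a base_t. */
int read_num(const struct Holder *h)
{
    const base_t *b = h->any_struct;
    return b->num;
}
```

The pointer stored in the holder points at the whole object and therefore also at its initial `base_t` member, which is the case the initial-member rule covers.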

The construct you describe of using a pointer to a "base" structure as an alias to several "derived" structures, while often used with things like struct sockaddr, is not guaranteed to work by the C standard.
While there is some language to suggest it might be supported, particularly 6.7.2.1p15:
Within a structure object, the non-bit-field members and the
units in which bit-fields reside have addresses that increase in
the order in which they are declared. A pointer to a structure
object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it
resides), and vice versa. There may be unnamed padding within
a structure object, but not at its beginning.
Other parts suggest it is not, particularly 6.3.2.3 which discusses pointer conversions that are allowed:
1 A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a
pointer to void and back again; the result shall compare equal
to the original pointer.
2 For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type;
the values stored in the original and converted pointers shall compare
equal.
3 An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer
constant. If a null pointer constant is converted to a pointer type,
the resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
4 Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall
compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined,
might not be correctly aligned, might not point to an entity
of the referenced type, and might be a trap representation.
6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of values
of any integer type.
7 A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare
equal to the original pointer. When a pointer to an object is
converted to a pointer to a character type, the result points to the
lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining
bytes of the object.
8 A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare
equal to the original pointer. If a converted pointer is used to call
a function whose type is not compatible with the referenced type, the
behavior is undefined.
Nothing in the above states that casting from one struct to another is allowed when the type of the first member is the same.
What is allowed however is making use of a union to do essentially the same thing. Section 6.5.2.3p6 states:
One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common
initial sequence (see below), and if the union object
currently contains one of these structures, it is permitted
to inspect the common initial part of any of them anywhere that a
declaration of the completed type of the union is visible. Two
structures share a common initial sequence if corresponding
members have compatible types (and, for bit-fields, the same widths)
for a sequence of one or more initial members.
So what you can do is define a union that contains all the possible types as well as the base type:
union various {
struct base { int num; } b;
struct one { int num; int a; } s1;
struct two { int num; double b; } s2;
struct three { int num; char *c; } s3;
};
Then you use this union anyplace you need one of the three subtypes, and you can freely inspect the base member to determine the type. For example:
void foo(union various *u)
{
switch (u->b.num) {
case 1:
printf("s1.a=%d\n", u->s1.a);
break;
case 2:
printf("s2.b=%f\n", u->s2.b);
break;
case 3:
printf("s3.c=%s\n", u->s3.c);
break;
}
}
...
union various u;
u.s1.num = 1;
u.s1.a = 4;
foo(&u);
u.s2.num = 2;
u.s2.b = 2.5;
foo(&u);
u.s3.num = 3;
u.s3.c = "hello";
foo(&u);

Related

Why void pointer if pointers can be casted into any type(in c)?

I want to understand the real need for a void pointer. For example, in the following code I use casting to be able to use the same ptr in different ways, so why is there really a void pointer if anything can be cast?
#include <stdio.h>
int main()
{
int x = 0xAABBCCDD;
int * y = &x;
short * c = (short *)y;
char * d = (char*)y;
*c = 0;
printf("x is %x\n",x);//aabb0000
d +=2;
*d = 0;
printf("x is %x\n",x);//aa000000
return 0;
}
Converting any pointer type to any other pointer type is not supported by base C (that is, C without any extensions or behavior not required by the C standard). The 2018 C standard says in clause 6.3.2.3, paragraph 7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer…
In that passage, we see two limitations:
If the pointer is not properly aligned, the conversion may fail in various ways. In your example, converting an int * to a short * is unlikely to fail since int typically has stricter alignment than short. However, the reverse conversion is not supported by base C. Say you define an array with short x[20]; or char x[20];. Then the array will be aligned as needed for a short or char, but not necessarily as needed for an int, in which case the behavior of (int *) x would not be defined by the C standard.
The value that results from the conversion is mostly unspecified. This passage only guarantees that converting it back yields the original pointer (or something equivalent). It does not guarantee you can do anything useful with the pointer without converting it back—you cannot necessarily use a pointer converted from int * to access a short.
The standard does make some additional guarantees about certain pointer conversions. One of them is in the continuation of the passage above:
… When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
So you can use a pointer converted from int * to access the individual bytes that represent an int, and you can do the same to access the bytes of any other object type. But that guarantee is made only for accessing the individual bytes with a character type, not with a short type.
From the above, we know that after the short * c = (short *)y; in your example, c does not necessarily point to any part of the x it originated from—the value resulting from the pointer conversion is not guaranteed to work as a short * at all. But, even if it does point to the place where x is, base C does not support using c to access those bytes, because 6.5 7 says:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
So the *c = 0; in your example is not supported by C for two reasons: c does not necessarily point to any part of x or to any valid address, and, even if it does, the behavior of modifying part of the int x using short type is not defined by the C standard. It might appear to work in your C implementation, and it might even be supported by your C implementation, but it is not strictly conforming C code.
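For comparison, here is a sketch of modifying part of an int through a character type, which is the access the standard does permit. The helper name `clear_bytes` is hypothetical, and which bytes of the value end up cleared depends on the byte order of the platform.

```c
#include <stddef.h>

/* Zero `count` bytes of *x starting at byte `offset`, using unsigned char
   access, which may alias any object. Bytes past the end of *x are skipped. */
void clear_bytes(int *x, size_t offset, size_t count)
{
    unsigned char *bytes = (unsigned char *)x;
    for (size_t i = 0; i < count && offset + i < sizeof *x; i++)
        bytes[offset + i] = 0;
}
```

Calling `clear_bytes(&x, 0, 2)` plays the role of `*c = 0;` from the question without accessing the int through a short lvalue.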
The C standard provides the void * type for use when a specific type is inadequate. 6.3.2.3 1 makes a similar guarantee for pointers to void as it does for pointers to objects:
A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
void * is used with routines that must work with arbitrary object types, such as qsort. char * could serve this purpose, but it is better to have a separate type that clearly denotes no specific type is associated with it. For example, if the parameter to a function were char *p, the function could inadvertently use *p and get a character that it does not want. If the parameter is void *p, then the function must convert the pointer to a specific type before using it to access an object. Thus having a special type for “generic pointers” can help avoid errors as well as indicate intent to people reading the code.
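A short sketch of the qsort case: the comparator receives `const void *` and converts each argument back to the real element type before using it. The names `compare_ints` and `sort_ints` are made up for this example.

```c
#include <stdlib.h>

/* qsort passes pointers to two elements as const void *. */
int compare_ints(const void *a, const void *b)
{
    const int *x = a;   /* convert back to the actual element type */
    const int *y = b;
    return (*x > *y) - (*x < *y);
}

/* Sort an array of ints in ascending order. */
void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_ints);
}
```

The comparator cannot dereference its parameters until it has chosen a type for them, which is the error-avoidance point made above.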
Why void pointer if pointers can be casted into any type(in c)?
C does not specify that void* can be cast into a pointer of any type. A void * may be cast into a pointer to any object type. In other words, a void * may be insufficient to completely store a function pointer.
need of having a void pointer
A void * is a universal pointer for object types. Setting aside pointers to const, volatile, etc. concerns, functions like malloc(), memset() provide universal ways to allocate and move/set data.
On more unusual architectures, an int * and a void * (and other pointer types) may have different sizes and interpretations. void* is the common pointer type for objects, complete enough to store the information needed to reconstitute the original pointer, regardless of the object type pointed to.

Cast struct pointer to another struct

This code snippet prints the value 5. I don't understand why.
#include <stdio.h>
struct A
{
int x;
};
struct B
{
struct A a;
int y;
};
void printA(struct A *a)
{
printf("A obj: %d\n", a->x);
}
int main(void)
{
struct B b = {
{
5
},
10
};
struct A *a = (struct A*)&b;
printA(a);
printf("Done.\n");
return 0;
}
When I create b, a pointer to it would point to the data { {5}, 10 }.
When I cast &b to struct A*, I'm assuring the compiler that this struct A* points to a struct of a single data element of data type int. Instead, I'm providing it a pointer to a struct of two data elements of data types struct A, int.
Even if the second variable is ignored (since struct A has only one data member) I am still providing it a struct whose member is of data type struct A, not int.
Thus, when I pass in a to printA, the line a->x is performed, essentially asking to access the first data element of a. The first data element of a is of data type struct A, which is a type mismatch due to the %d expecting a digit, not a struct A.
What exactly is happening here?
When I create b, a pointer to it would point to the data { {5}, 10 }.
Yes, in the sense of that being the text of a type-appropriate and value-correct C initializer. That text itself should not be taken literally as the value of the structure.
When I cast &b to struct A*, I'm assuring the compiler that this
struct A* points to a struct of a single data element of data type
int.
No, not exactly. You are converting the value of the expression &b to type struct A *. Whether the resulting pointer actually points to a struct A is a separate question.
Instead, I'm providing it a pointer to a struct of two data
elements of data types struct A, int.
No, not "instead". Given that struct B's first member is a struct A, and that C forbids padding before the first member of a structure, a pointer to a struct B also points to a struct A -- the B's first member -- in a general sense. As @EricPostpischi observed in comments, the C standard explicitly specifies the outcome in your particular case: given struct B b, converting a pointer to b to type struct A * yields a pointer to b's first member, a struct A.
Even if the second variable is ignored (since struct A has only one
data member) I am still providing it a struct whose member is of data
type struct A, not int.
The first sizeof(struct A) bytes of the representation of a struct B form the representation of its first member, a struct A. That the latter is a member of the former has no physical manifestation other than their overlap in memory.
Even if the language did not explicitly specify it, given your declaration of variable b as a struct B, there would be no practical reason to expect that the expression (struct A*)&b == &b.a would evaluate to false, and there can be no question that the right-hand pointer can be used to access a struct A.
Thus, when I pass in a to printA, the line a->x is performed,
essentially asking to access the first data element of a.
Yes, and this is where an assertion enters that a really does point to a struct A. Which it does in your case, as already discussed.
The first
data element of a is of data type struct A,
No. *a is by definition a struct A. Specifically, it is the struct A whose representation overlaps the beginning of the representation of b. If there were not such a struct A then the behavior would be undefined, but that's not an issue here. Like every struct A, it has a member, designated by x, that is an int.
which is a type mismatch
due to the %d expecting a digit, not a struct A.
You mean expecting an int. And that's what it gets. That's what the expression a->x reads, supposing the behavior is defined at all, because that is the type of that expression. Under different circumstances the behavior might indeed not be defined, but under no circumstance does that expression ever provide a struct A.
What exactly is happening here?
What seems to be happening is that you are imagining different, higher-level semantics than C actually provides. In particular, you seem to have a mental model of structures as lists of distinguishable member objects, and that's leading you to form incorrect expectations.
Perhaps you are more familiar with a weakly typed language such as Perl, or a dynamically typed language such as Python, but C works differently. You cannot look at a C object and usefully ask "what is your type"? Instead, you look at each and every object through the lens of the static type of the expression used to access it.
The language-lawyer explanation of why the code is fine:
Any object pointer in C may be converted to any other object pointer type (C17 6.3.2.3 §7).
Whether it is safe to dereference the pointed-at object after conversion depends on: 1) whether the types are compatible and thereby correctly aligned, and 2) whether the respective pointer types used are allowed to alias.
As a special case, a pointer to a struct type is equivalent to a pointer to its first member. The relevant part of C17 6.7.2.1 §15 says:
A pointer to a structure object,
suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in
which it resides), and vice versa.
This means that (struct A*)&b is fine. &b is suitably converted to the correct type.
There is no violation of "strict aliasing", since we fulfil C17 6.5 §7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
...
an aggregate or union type that includes one of the aforementioned types among its members
The effective type of the initial member being struct A. The lvalue access that happens inside the print function is fine. struct B is also an aggregate type that includes struct A among its members, so strict aliasing violations are impossible, regardless of the initial member rule cited at the top.
There is a special rule in the C standard for this case. C 2011 6.7.2.1 15 says:
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.
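The initial-member rule can be checked directly. In this sketch the helper name `first_member_matches` is hypothetical; it compares the converted pointer with the address of the member itself.

```c
struct A { int x; };
struct B { struct A a; int y; };

/* Returns 1 when the cast pointer and &b->a refer to the same struct A. */
int first_member_matches(struct B *b)
{
    struct A *via_cast   = (struct A *)b;  /* covered by the initial-member rule */
    struct A *via_member = &b->a;          /* same object, no cast needed */
    return via_cast == via_member && via_cast->x == b->a.x;
}
```

Where the member is reachable by name, writing `&b.a` avoids the cast altogether and makes the intent explicit.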

Casting structure pointers between structs containing pointers to different types?

I have a structure, defined as follows:
struct vector
{
(TYPE) *items;
size_t nitems;
};
where type may literally be any type, and I have a type-agnostic structure of similar kind:
struct _vector_generic
{
void *items;
size_t nitems;
};
The second structure is used to pass structures of the first kind of any type to a resizing function, for example like this:
struct vector v;
vector_resize((struct _vector_generic*)&v, sizeof(*(v.items)), v.nitems + 1);
where vector_resize attempts to realloc memory for the given number of items in the vector.
int
vector_resize (struct _vector_generic *v, size_t item_size, size_t length)
{
void *new = realloc(v->items, item_size * length);
if (!new)
return -1;
v->items = new;
v->nitems = length;
return 0;
}
However, the C standard states that pointers to different types are not required to be of the same size.
6.2.5.27:
A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type. Similarly, pointers
to qualified or unqualified versions of compatible types shall have
the same representation and alignment requirements. All pointers to
structure types shall have the same representation and alignment
requirements as each other. All pointers to union types shall have the
same representation and alignment requirements as each other. Pointers
to other types need not have the same representation or alignment
requirements.
Now my question is, should I be worried that this code may break on some architectures?
Can I fix this by reordering my structs such that the pointer type is at the end? for example:
struct vector
{
size_t nitems;
(TYPE) *items;
};
And if not, what can I do?
For reference of what I am trying to achieve, see:
https://github.com/andy-graprof/grapes/blob/master/grapes/vector.h
For example usage, see:
https://github.com/andy-graprof/grapes/blob/master/tests/grapes.tests/vector.exp
Your code has undefined behavior.
Accessing an object using an lvalue of an incompatible type results in undefined behavior.
The standard defines this in:
6.5 p7:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
struct vector and struct _vector_generic have incompatible types and do not fit into any of the above categories. Their internal representation is irrelevant in this case.
For example:
struct vector v;
struct _vector_generic* g = (struct _vector_generic*)&v;
g->nitems = 123 ; //undefined!
The same goes for your example, where you pass the address of the struct vector to the function and interpret it as a _vector_generic pointer.
The sizes and padding of the structs could also be different causing elements to be positioned at different offsets.
What you can do is use your generic struct, and cast it in the main code depending on the type the void pointer holds.
struct gen
{
void *items;
size_t nitems;
size_t nsize ;
};
struct gen* g = malloc( sizeof(*g) ) ;
g->nitems = 10 ;
g->nsize = sizeof( float ) ;
g->items = malloc( g->nsize * g->nitems ) ;
float* f = g->items ;
f[g->nitems-1] = 1.2345f ;
...
Using the same struct definition you can allocate for a different type:
struct gen* g = malloc( sizeof(*g) ) ;
g->nitems = 10 ;
g->nsize = sizeof( int ) ;
g->items = malloc( g->nsize * g->nitems ) ;
int* i = g->items ;
...
Since you are storing the size of the type and the number of elements, it is obvious what your resize function would look like (try it).
You will have to be careful to remember what type is used in which variable as the compiler will not warn you because you are using void*.
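One possible shape for that resize function, as a sketch rather than the answerer's own code. The name `gen_resize`, the overflow check, and the choice to reject a zero count are assumptions made for this example.

```c
#include <stdint.h>
#include <stdlib.h>

struct gen
{
    void *items;
    size_t nitems;
    size_t nsize;
};

/* Resize g to hold new_count elements of g->nsize bytes each.
   Returns 0 on success, -1 on failure (g is left unchanged). */
int gen_resize(struct gen *g, size_t new_count)
{
    /* Reject empty requests and byte counts that would overflow size_t. */
    if (g->nsize == 0 || new_count == 0 || new_count > SIZE_MAX / g->nsize)
        return -1;
    void *resized = realloc(g->items, g->nsize * new_count);
    if (!resized)
        return -1;
    g->items = resized;
    g->nitems = new_count;
    return 0;
}
```

No cast between struct types is needed, because every vector uses the same `struct gen` and only the `items` pointer is converted at the point of use.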
The code in your question invokes undefined behaviour (UB), because you de-reference a potentially invalid pointer. The cast:
(_vector_generic*)&v
... is covered by 6.3.2.3 paragraph 7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
If we assume alignment requirements are met, then the cast does not invoke UB. However, there is no requirement that the converted pointer must "compare equal" with (i.e. point at the same object as) the original pointer, nor even that it points to any object at all - that is to say, the value of the pointer is unspecified - therefore, to dereference this pointer (without first ascertaining that it is equal to the original) invokes undefined behaviour.
(Many people who know C well find this odd. I think this is because they know a pointer cast usually compiles to no operation - the pointer value simply remains as it is - and therefore they see pointer conversion as purely a type conversion. However, the standard does not mandate this).
Even if the pointer after conversion did compare equal with the original pointer, 6.5 paragraph 7 (the so-called "strict aliasing rule") would not allow you to dereference it. Essentially, you cannot access the same object via two pointers with different type, with some limited exceptions.
Example:
struct a { int n; };
struct b { int member; };
struct a a_object;
struct b * bp = (struct b *) &a_object; // bp takes an unspecified value
// Following would invoke UB, because bp may be an invalid pointer:
// int m = bp->member;
// But what if we can ascertain that bp points at the original object?:
if (bp == &a_object) {
// The comparison in the line above actually violates constraints
// in 6.5.9p2, but it is accepted by many compilers.
int m = bp->member; // UB if executed, due to 6.5p7.
}
Let's, for the sake of discussion, ignore that the C standard formally says this is undefined behavior. Undefined behavior simply means that something is beyond the scope of the language standard: anything can happen and the C standard makes no guarantees. There may however be "external" guarantees on the particular system you are using, made by those who made the system.
And in the real world where there is hardware, there are indeed such guarantees. There are just two things that can go wrong here in practice:
TYPE* having a different representation or size than void*.
Different struct padding in each struct type because of alignment requirements.
Both of these seem unlikely and can be dodged with static asserts:
static void ct_assert (void) // dummy function never linked or called by anyone
{
struct vector v1;
struct _vector_generic v2;
static_assert(sizeof(v1.items) == sizeof(v2.items),
"Err: unexpected pointer format.");
static_assert(sizeof(v1) == sizeof(v2),
"Err: unexpected padding.");
}
Now the only thing left that could go wrong is if a "pointer to x" has same size but different representation compared to "pointer to y" on your specific system. I have never heard of such a system anywhere in the real world. But of course, there are no guarantees: such obscure, unorthodox systems may exist. In that case, it is up to you whether you want to support them, or if it will suffice to just have portability to 99.99% of all existing computers in the world.
In practice, the only time you have more than one pointer format on a system is when you are addressing memory beyond the CPU's standard address width, which is typically handled by non-standard extensions such as far pointers. In all such cases, the pointers will have different sizes and you will detect such cases with static assert above.
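A variation on those compile-time checks that also compares member offsets, written at file scope with C11 `_Static_assert`. `struct vector_int` is a hypothetical instantiation with TYPE = int, and the generic struct is renamed `vector_generic` here only because identifiers starting with an underscore are reserved at file scope. These checks cover the layout assumptions only; the standard still does not define access through the other struct type.

```c
#include <stddef.h>

struct vector_int     { int  *items; size_t nitems; };  /* assumed: TYPE = int */
struct vector_generic { void *items; size_t nitems; };

/* Compilation fails if any of the layout assumptions does not hold. */
_Static_assert(sizeof(int *) == sizeof(void *),
               "unexpected pointer format");
_Static_assert(sizeof(struct vector_int) == sizeof(struct vector_generic),
               "unexpected padding");
_Static_assert(offsetof(struct vector_int, nitems) ==
               offsetof(struct vector_generic, nitems),
               "unexpected member offset");
```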

Casting struct pointers

Assuming code is compiled with c11 and strict aliasing enabled.
I am not searching for a different approach, I would like to focus on this specific problem and if it works or why not.
(If I unintentionally made some unrelated error let me know and I will fix it)
c11 standard says:
6.2.5.28
All pointers to structure types shall have the same representation and alignment requirements as each other.
6.7.2.1.6
a structure is a type consisting of a sequence of members, whose
storage is allocated in an ordered sequence
This means the pointer size and alignment of pointers in struct A and B are the same.
#include <stdio.h>
#include <stdlib.h>
struct S1
{
int i ;
} ;
struct S2
{
float f ;
} ;
struct A
{
struct S1* p ;
} ;
struct B
{
struct S2* p ;
} ;
int main( void )
{
Structs A and B have pointers to structs S1 and S2, and structs A and B are guaranteed to have the same size and alignment.
We have a struct B whose member pointer is a struct S2 pointer, but it is pointing to some struct S1, which is achieved with a void* cast.
struct S1 s1 = { 0 } ;
struct B* b = malloc( sizeof( *b ) ) ;
b->p = ( void* ) &s1 ;
That is ok, we can store the pointer, as long as we don't actually use the pointer.
But we want to.
We could cast the pointer to struct S1.
( ( struct S1* )(b->p) )->i = 123 ; //redundant brackets for emphasis
printf("%d\n" , s1.i ) ;
And use it correctly.
So far I don't see any problems, as the pointer was cast to the correct type.
But can we cast the whole struct B to struct A instead? They are the same regarding size and alignment, though the standard might complain(?), could compilers produce undefined behavior?
( ( struct A* )b)->p->i = 666 ;
printf("%d\n" , s1.i ) ;
I know the solution is to use a union (or use a void* and just cast correctly every time), as the standard allows reading a member other than the one last used to store a value.
6.5.2.3.3( 95 )
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
But, I would like to avoid this:
struct C
{
union
{
struct S1* p1 ;
struct S2* p2 ;
} ;
} ;
struct C* c = malloc( sizeof( *c ) ) ;
c->p2 = ( void* )&s1 ;
c->p1->i = 444 ;
printf("%d\n" , s1.i ) ;
return 0 ;
}
What you described until this point:
But can we cast the whole struct B to struct A instead?
is all correct, but the answer to this question is unfortunately no. It is only permitted to access a struct through a pointer to incompatible type if the two structs contain a "common initial sequence", i. e. if their first few members have the same type. Since your structs don't (namely, the first members are of different types), it is not legal to access an object of type S1 through a pointer to S2 and vice versa. In particular, doing so violates the strict aliasing rule.
From C99, 6.5 paragraph 7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
In the expression ((struct A *) b)->p->i, the access to p violates C 2011 6.5 7, which says “An object shall have its stored value accessed only by an lvalue expression that has one of the following types: a type compatible with the effective type of the object,…”. b->p is a pointer to struct S2, but ((struct A *) b)->p is an lvalue expression with type pointer to struct S1. Although the representations of these pointers may be identical, they are not compatible types.
I think that in this particular case your example will work and conforms to the standard. The ANSI standard says:
A pointer to a structure object, suitably
cast, points to its initial member (or if that member is a bit-field,
then to the unit in which it resides), and vice versa. There may
therefore be unnamed holes within a structure object, but not at its
beginning, as necessary to achieve the appropriate alignment.
The pointer p in your example is always the first (and only) field of the structure. As I understand the previous paragraph, a pointer to a struct A is the same as a pointer to A::p (excuse the C++ notation), which is the same as a pointer to B::p, which is the same as a pointer to B. Explicit casting does not change the value of the pointer, so your example should conform to the standard.
Needless to say, it is not very pretty, and your boss will probably not appreciate this style of programming.
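For comparison, here is a minimal sketch of what the quoted paragraph does guarantee: converting a pointer to a struct into a pointer to its initial member's own type. The struct definitions and the helper name read_via_initial_member are assumptions made for illustration, since the original definitions are not shown here. Because the access goes through an lvalue of type struct S2 *, which is the declared type of the member, it does not run into the aliasing objection raised about reading that member as a struct S1 *.

```c
struct S2 { int i; };
struct B  { struct S2 *p; };   /* assumed layout: p is the initial member */

/* Hypothetical helper: reaches b->p by converting the struct pointer
   to a pointer to the initial member's own type. */
int read_via_initial_member(struct B *b)
{
    struct S2 **first = (struct S2 **)b;  /* same address as &b->p */
    return (*first)->i;                   /* lvalue type matches b->p */
}
```

This only shows that the address conversion is fine; it says nothing in favour of reading the member through a pointer type other than the one it was declared with.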

Can we use va_arg with unions?

6.7.2.1 paragraph 14 of my draft of the C99 standard has this to say about unions and pointers (emphasis, as always, added):
The size of a union is sufficient to contain the largest of its members. The value of at
most one of the members can be stored in a union object at any time. A pointer to a
union object, suitably converted, points to each of its members (or if a member is a bit-
field, then to the unit in which it resides), and vice versa.
All well and good, that means that it is legal to do something like the following to copy either a signed or unsigned int into a union, assuming we only want to copy it out into data of the same type:
union ints { int i; unsigned u; };
int i = 4;
union ints is = *(union ints *)&i;
int j = is.i; // legal
unsigned k = is.u; // not so much
7.15.1.1 paragraph 2 has this to say:
The va_arg macro expands to an expression that has the specified type and the value of
the next argument in the call. The parameter ap shall have been initialized by the
va_start or va_copy macro (without an intervening invocation of the va_end macro for the same ap). Each invocation of the va_arg macro modifies ap so that the values of successive arguments are returned in turn. The parameter type shall be a type name specified such that the type of a pointer to an object that has the specified type can be obtained simply by postfixing a * to type. If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:
—one type is a signed integer type, the other type is the corresponding unsigned integer
type, and the value is representable in both types;
—one type is pointer to void and the other is a pointer to a character type.
I'm not going to go and cite the part about default argument promotions. My question is: is this defined behavior:
void func(int i, ...)
{
va_list arg;
va_start(arg, i);
union ints is = va_arg(arg, union ints);
va_end(arg);
}
int main(void)
{
func(0, 1);
return 0;
}
If so, it would appear to be a neat trick to overcome the "and the value is representable in both types" requirement of signed/unsigned integer conversion (albeit in a way that's rather difficult to do anything with legally). If not, it would appear to be safe to just use unsigned in this case, but what if there were more elements in the union with more incompatible types? If we can guarantee that we won't access the union by element (i.e. we just copy it into another union or storage space that we're treating like a union) and that all elements of the union are the same size, is this allowed with varargs? Or would it only be allowed with pointers?
In practice I expect this code will almost never fail, but I want to know if it's defined behavior. My current guess is that it appears not to be defined, but that seems incredibly dumb.
You have a couple things off.
A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.
This does not mean that the types are compatible. In fact, they are not compatible. So the following code is wrong:
func(0, 1); // undefined behavior
If you want to pass a union,
func(0, (union ints){ .u = BLAH });
You can check by writing the code,
union ints x;
x = 1;
GCC gives an "error: incompatible types in assignment" message when compiling.
However, most implementations will "probably" do the right thing in both cases. There are some other problems...
union ints {
int i;
unsigned u;
};
int i = 4;
union ints is = *(union ints *)&i; // Invalid
int j = is.i; // legal
unsigned k = is.u; // also legal (see note)
The behavior when you dereference the address of an object using a type other than its actual type, as in *(union ints *)&i, is sometimes undefined (looking up the reference, but I'm pretty sure about this). However, in C99 it is permitted to access a union member other than the most recently stored union member (or is it C1x?), but the value is implementation defined and may be a trap representation.
About type punning through unions: As Pascal Cuoq notes, it's actually TC3 that defines the behavior of accessing a union element other than the most recently stored one. TC3 is the third update to C99. The good news is that this part of TC3 is really codifying existing practice — so think of it as a de facto part of C prior to TC3.
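As a small illustration of the union-based punning described above, the sketch below stores through one member and reads through the other. The helper name int_as_unsigned is made up for this example. For a non-negative value that both types can represent, the result is the same value; for negative inputs the result depends on the implementation's signed representation, so the sketch makes no claim about it.

```c
union ints {
    int i;
    unsigned u;
};

/* Hypothetical helper: store as int, read back as unsigned
   through the same union object. */
unsigned int_as_unsigned(int value)
{
    union ints tmp;
    tmp.i = value;   /* last stored member is i */
    return tmp.u;    /* bytes reinterpreted as unsigned */
}
```

Calling int_as_unsigned(4) gives 4, which is the case the question's "not so much" comment was worried about.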
Since the standard says:
The parameter type shall be a type name specified such that the type of a pointer to an object that has the specified type can be obtained simply by postfixing a * to type.
For union ints, that condition is satisfied: union ints * is a perfectly good representation of a pointer to a union ints, so there is nothing in that sentence to prevent it being used to collect a value pushed onto the stack as a union.
If you cheat and try to pass a plain int or unsigned int in place of a union, then you would be invoking undefined behaviour. Thus, you could use:
union ints u1 = ...;
func(0, (union ints) { .i = 0 });
func(1, (union ints) { .u = UINT_MAX });
func(2, u1);
You could not use:
func(1, 0);
The arguments are not union types.
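Putting the pieces together, here is a minimal sketch of a variadic function that collects a union ints with va_arg when the caller really does pass a union. The function name read_tagged and the use of the fixed parameter as a selector are assumptions for illustration, not part of the original func.

```c
#include <stdarg.h>

union ints {
    int i;
    unsigned u;
};

/* Hypothetical example: the fixed argument says which member the
   caller stored; the variadic argument must be a union ints. */
long long read_tagged(int is_unsigned, ...)
{
    va_list ap;
    va_start(ap, is_unsigned);
    union ints v = va_arg(ap, union ints);  /* type matches the argument */
    va_end(ap);
    return is_unsigned ? (long long)v.u : (long long)v.i;
}
```

A call such as read_tagged(0, (union ints){ .i = -5 }) passes a genuine union, whereas read_tagged(0, -5) would pass an int and fall under the undefined case quoted from 7.15.1.1.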
I don't see why you think that code should never fail in practice. It would fail on any implementation where integer types are passed by register but aggregate types (even when small) are passed on the stack, and I see nothing in the standard that forbids such implementations. A union containing an int is not a type compatible with int, even if their sizes are the same.
Back to your first code fragment, it has a problem too:
union ints is = *(union ints *)&i;
This is an aliasing violation and invokes undefined behavior. You could avoid it by using memcpy, and I suppose it would then be legal.
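A minimal sketch of the memcpy approach, with a made-up helper name ints_from_int: the bytes of the int are copied into a union object that was declared as a union, so no lvalue of the wrong type is ever used to read the int.

```c
#include <string.h>

union ints {
    int i;
    unsigned u;
};

/* Hypothetical helper: build a union ints from an int by copying
   bytes instead of dereferencing a converted pointer. */
union ints ints_from_int(int value)
{
    union ints is = { 0 };               /* start from a known state */
    memcpy(&is, &value, sizeof value);   /* copy the object representation */
    return is;
}
```

In this particular case an initializer such as (union ints){ .i = value } does the same job more directly; memcpy is mainly useful when the source bytes arrive as raw storage.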
I'm also a bit confused about your comment here:
unsigned k = is.u; // not so much
Since the value 4 is representable in both the signed and unsigned types, this should be legal, unless it's specifically forbidden as a special case.
If this doesn't answer your question, perhaps you could elaborate more on what (albeit theoretical) problem you're trying to solve.

Resources