Consider an arbitrary struct where the C compiler will perform padding:
struct node {
enum type type;
size_t num_children;
void** nodes;
};
Will C ever perform padding before the first element? I ask this as I need to do some funky things with void* and require that
void* a = node->nodes[0];
enum type t = *(enum type*)(a);
will always be evaluated correctly. I'm aware that I can force no padding but would rather not.
Will C ever perform padding before the first element?
No. This is explicitly prohibited in the C standard:
Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning.
(emphasis mine).
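To make the quoted rule concrete, here is a minimal sketch based on the question's struct. The enumerators, the member name `type`, and the helpers `node_type_of` and `first_member_at_start` are assumptions made for illustration; they are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

enum type { TYPE_LEAF, TYPE_BRANCH };  /* hypothetical enumerators */

struct node {
    enum type type;       /* initial member, so it lives at offset 0 */
    size_t num_children;
    void **nodes;
};

/* Read the tag through an untyped pointer that really points to a struct node. */
enum type node_type_of(void *p)
{
    assert(p != NULL);
    return *(enum type *)p;
}

/* Returns 1 if the initial member starts at the address of the struct itself. */
int first_member_at_start(void)
{
    struct node n;
    return offsetof(struct node, type) == 0
        && (void *)&n == (void *)&n.type;
}
```

Calling `node_type_of(node->nodes[0])` is only meaningful when the stored pointer really points to a `struct node` (or to another struct whose first member is an `enum type`).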
Related
I'm implementing a binary tree in C89, and I'm trying to share common attributes among all node structs through composition. Thus I have the following code:
enum foo_type
{
FOO_TYPE_A,
FOO_TYPE_B
};
struct foo {
enum foo_type type;
};
struct foo_type_a {
struct foo base;
struct foo * ptr;
};
struct foo_type_b {
struct foo base;
char * text;
};
I'm including a member of type struct foo in all struct definitions as their initial member in order to provide access to the value held by enum foo_type regardless of struct type. To achieve this I'm expecting that a pointer to a structure object points to its initial member, but I'm not sure if this assumption holds in this case. With C99, the standard states the following (see ISO/IEC 9899:1999 6.7.2.1 §13)
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
Although all structs share a common struct foo object as their initial member, padding comes into play. While struct foo only has a single member, which is the size of an int, both struct foo_type_a and struct foo_type_b include pointer members, which in some cases increase the alignment and thus add padding.
So, considering this scenario, does the C programming language (C89 or any subsequent version) ensure that it's safe to access the value of struct foo::type through a pointer to an object, whether that object is of type struct foo or includes an object of type struct foo as its first member, such as struct foo_type_a or struct foo_type_b?
As you yourself quote from the C Standard, what you describe is supported by C99 and later versions.
It appears it was also supported by C89, as the language you quoted was already present in the ANSI C document from 1988:
3.5.2.1 Structure and union specifiers
...
Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the order
in which they are declared. A pointer to a structure object, suitably
cast, points to its initial member (or if that member is a bit-field,
then to the unit in which it resides), and vice versa. There may
therefore be unnamed holes within a structure object, but not at its
beginning, as necessary to achieve the appropriate alignment.
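The composition pattern from the question could be used roughly as follows. This is only a sketch: the type definitions are copied from the question, while `foo_get_type` and `foo_text_or_null` are hypothetical helpers added here to show both directions of the conversion.

```c
#include <assert.h>
#include <stddef.h>

enum foo_type { FOO_TYPE_A, FOO_TYPE_B };

struct foo { enum foo_type type; };

struct foo_type_a { struct foo base; struct foo *ptr; };
struct foo_type_b { struct foo base; char *text; };

/* Works for any object whose initial member is a struct foo. */
enum foo_type foo_get_type(const void *obj)
{
    assert(obj != NULL);
    return ((const struct foo *)obj)->type;
}

/* Hypothetical helper: returns the text of a B node, or NULL for other nodes. */
const char *foo_text_or_null(struct foo *f)
{
    assert(f != NULL);
    if (f->type == FOO_TYPE_B) {
        /* Valid only because f points to the base member of a struct foo_type_b. */
        return ((struct foo_type_b *)f)->text;
    }
    return NULL;
}
```

Any padding the compiler adds for the pointer members sits after `base`, so it does not move the tag away from offset 0.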
Consider this code:
// T is *any* type
struct str_T{
T a, b;
};
I know that there's (almost always) padding between objects with different alignments. But this time there are no different alignments, because both members are of type T. Will this assertion always pass?
static_assert(sizeof(str_T) == 2 * sizeof(T));
// i.e. padding-free
No, this is not guaranteed.
The compiler is always free to insert padding bytes between struct members or after the last one (unless you override this, for example with a packing directive).
Quoting from C11 draft, 6.7.2.1 Structure and union specifiers
Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning
No, there is no guarantee that it's the same memory layout.
C11 6.7.2.1(p6):
a structure is a type consisting of a sequence of members, whose
storage is allocated in an ordered sequence
Beyond member order and the ban on leading padding, the standard doesn't impose any layout rules.
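Since the standard does not promise a padding-free layout, one option is to let the compiler verify the assumption on each target. The sketch below assumes C11 (for `static_assert` from `<assert.h>`) and uses `double` as a stand-in for T; the helper function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef double T;   /* assumption: stand-in for "any type" */

struct str_T {
    T a, b;
};

/* Fails to compile on an implementation that pads this struct. */
static_assert(sizeof(struct str_T) == 2 * sizeof(T),
              "struct str_T contains padding");

/* Total padding in the struct, in bytes. */
size_t str_T_padding_bytes(void)
{
    return sizeof(struct str_T) - 2 * sizeof(T);
}

/* Size of the gap between a and b, in bytes. */
size_t str_T_gap_between_members(void)
{
    return offsetof(struct str_T, b) - sizeof(T);
}
```

If the static assertion holds, both helpers necessarily return 0. The point is that the check is made per implementation rather than assumed from the standard.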
Providing struct test:
#include <stdio.h>
int main() {
struct {
char* one;
char* two;
char* three;
} test;
char **ptr = &test.one;
*ptr = "one";
*++ptr = "two";
*++ptr = "three";
printf ("%s\n", test.one);
printf ("%s\n", test.two);
printf ("%s\n", test.three);
}
Question: Is there a guarantee that the elements in test struct are always in consecutive memory order? (So starting with the first struct element ++ptr will always point to the next element in the test struct?)
For pointers, you will almost certainly always observe that behaviour, but the only guarantee the C language makes is that the elements are ordered in memory. There might be gaps in between to optimize the alignment of the fields (for performance, especially on RISC architectures).
The right way to do this is to use the offsetof macro, or make it an array.
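A sketch of the offsetof approach, under the assumption that the struct is given a tag (`struct test`) and const-qualified members; `test_offsets` and `test_fill` are hypothetical names introduced here. The offsets come from the compiler, so any padding between members is accounted for.

```c
#include <assert.h>
#include <stddef.h>

struct test {
    const char *one;
    const char *two;
    const char *three;
};

/* Member offsets as computed by the compiler. */
static const size_t test_offsets[] = {
    offsetof(struct test, one),
    offsetof(struct test, two),
    offsetof(struct test, three)
};

/* Store values[i] into the i-th member without assuming the members are contiguous. */
void test_fill(struct test *t, const char *const *values)
{
    size_t i;
    assert(t != NULL && values != NULL);
    for (i = 0; i < sizeof test_offsets / sizeof test_offsets[0]; ++i) {
        const char **member = (const char **)((char *)t + test_offsets[i]);
        *member = values[i];
    }
}
```

If the members never need individual names, replacing them with a single `const char *items[3]` array is simpler, because array elements are guaranteed to be contiguous.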
As noted in a comment to the question:
Yes; the elements of a structure are stored in the order they are declared. What can upset calculations is that there may be gaps (padding) between elements. It won't happen when they're all the same type.
However, you should ask yourself: if I need to step through the elements sequentially, why am I not using an array? Arrays are designed for sequential access; structures are not (and writing code to access the elements of a structure sequentially is messy).
Some relevant parts of the standard are:
§6.7.2.1 Structure and union specifiers
¶6 As discussed in 6.2.5, a structure is a type consisting of a sequence of members, whose
storage is allocated in an ordered sequence, and a union is a type consisting of a sequence
of members whose storage overlap.
¶15 Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
§6.2.5 Types
¶20 …
A structure type describes a sequentially allocated nonempty set of member objects
(and, in certain circumstances, an incomplete array), each of which has an optionally
specified name and possibly distinct type.
Is a pointer to the struct aligned as if it were a pointer to the first element?
or
Is a conversion between a pointer to a struct and a pointer to the type of its first member (or vice versa) ever UB?
(I hope they are the same question...)
struct element
{
tdefa x;
tdefb y;
};
int foo(struct element* e);
int bar(tdefa* a);
~~~~~
tdefa i = 0;
foo((struct element*)&i);
or
struct element e;
bar((tdefa*)&e);
Where tdefa and tdefb could be defined as any type
Background:
I asked this question
and a user in a comment on one of the answers brought up C11 6.3.2.3 p7 that states:
"A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined"
However, I am having trouble working out when this would become an issue. My understanding was that padding would allow all members of the struct to be aligned correctly. Have I misunderstood?
and if:
struct element e;
tdefa* a = &e.x;
would work then:
tdefa* a = (tdefa*)&e;
would too.
There is never any initial padding; the first member of a struct is required to start at the same address as the struct itself.
You can always access the first member of a struct by casting a pointer to the whole struct, to be a pointer to the type of the first member.
Your foo example might run into trouble because foo will be expecting its argument to point to a struct element which in fact it does not, and there might be an alignment mismatch.
However, the bar example and the final example are fine.
A pointer to a structure always points to its initial member.
Here is the citation directly from C99 standard (6.7.2.1, paragraph 13), emphasis mine:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning
As for your foo and bar examples:
The call to bar will be fine, as bar expects a pointer to a tdefa, which is exactly what it's getting.
The call to foo, however, is problematic. foo expects a pointer to a full struct element, but you're only passing a pointer to a tdefa (while the struct consists of both a tdefa and a tdefb).
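The difference between the two directions can be sketched like this. The concrete types chosen for tdefa and tdefb are assumptions, `_Alignof` requires C11, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef short  tdefa;   /* assumption: stand-ins for the question's */
typedef double tdefb;   /* tdefa and tdefb                          */

struct element {
    tdefa x;
    tdefb y;
};

int bar(tdefa *a)
{
    assert(a != NULL);
    return *a;
}

/* Safe direction: a struct pointer converted to a pointer to its first member. */
int first_via_cast(struct element *e)
{
    return bar((tdefa *)e);
}

/* Alignment requirements of the two types involved. */
size_t element_alignment(void) { return _Alignof(struct element); }
size_t tdefa_alignment(void)   { return _Alignof(tdefa); }
```

A lone `tdefa i` is only required to satisfy `tdefa_alignment()`. When `element_alignment()` is larger, `(struct element *)&i` may produce a misaligned pointer, which is the case C11 6.3.2.3 p7 describes, and there is no `y` behind it in any case.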
Consider the following two structs:
struct a
{
int a;
};
struct b
{
struct a a_struct;
int b;
};
the following instantiation of struct b:
struct b b_struct;
and this condition:
if (&b_struct == (struct b*)&b_struct.a_struct)
printf("Yes\n");
Does the C standard mandate this to always evaluate true?
Yes, according to 6.7.2.1, "Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning."
Can't find it in the C Standard, but the answer is "yes" - the C++ Standard says:
A pointer to a POD-struct object,
suitably converted using a
reinterpret_cast, points to its
initial member (or if that member is a
bit-field, then to the unit in which
it resides) and vice versa. [Note:
There might therefore be unnamed
padding within a POD-struct object,
but not at its beginning, as necessary
to achieve appropriate alignment. ]
As C and C++ POD objects must be compatible, the same must be true for C.
Yes.
There must not be any padding in front of the first member.
The address of a structure is the same as the address of its first member, provided that the appropriate cast is used.
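The rule also applies through nesting, which the following sketch checks for the structs from the question. `same_address` and `outer_of` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

struct a { int a; };
struct b { struct a a_struct; int b; };

/* Returns 1 if the outer struct, its first member, and that member's
   first member all share one address. */
int same_address(struct b *p)
{
    assert(p != NULL);
    return p == (struct b *)&p->a_struct
        && (void *)p == (void *)&p->a_struct.a
        && offsetof(struct b, a_struct) == 0;
}

/* Go back from the inner struct to the object that contains it.
   Only valid when inner really is the a_struct member of a struct b. */
struct b *outer_of(struct a *inner)
{
    assert(inner != NULL);
    return (struct b *)inner;
}
```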