Is this a C11 anonymous struct? - c

I was looking into the C11 draft and it says
An unnamed member of structure type with no tag is called an anonymous structure; an unnamed member of union type with no tag is called an anonymous union. The members of an anonymous structure or union are considered to be members of the containing structure or union.
So I constructed the following testcase
// struct type with no tag
typedef struct {
unsigned char a;
unsigned char b;
// ... Some other members ...
unsigned char w;
} AToW;
union
{
AToW; // <- unnamed member
unsigned char bytes[sizeof(AToW)];
} myUnion;
Clang and GCC both complain about the unnamed member, saying that the declaration has no effect. Did I do something wrong, or do they simply not support that feature yet?

No, that's not an unnamed member.
An example is:
struct outer {
int a;
struct {
int b;
int c;
};
int d;
};
The inner structure containing members b and c is an unnamed member of struct outer. The members of this unnamed member, b and c, are considered to be members of the containing structure.
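As a minimal sketch of what "considered to be members of the containing structure" means in practice (the helper name outer_sum is made up for illustration and is not part of the question), b and c can be reached through a struct outer directly, and even offsetof accepts them:

```c
#include <assert.h>
#include <stddef.h>

struct outer {
    int a;
    struct {            /* anonymous structure: no tag and no member name */
        int b;
        int c;
    };
    int d;
};

/* b is treated as a member of struct outer, so offsetof can name it directly. */
static_assert(offsetof(struct outer, b) > offsetof(struct outer, a),
              "b is laid out after a");

/* Hypothetical helper: sums all four members without naming the inner struct. */
int outer_sum(const struct outer *o)
{
    return o->a + o->b + o->c + o->d;
}
```

An initializer such as `struct outer o = { 1, { 2, 3 }, 4 };` still uses nested braces for the inner structure, even though access goes through o.b and o.c. This needs a C11 compiler (or an older one that offers anonymous structs as an extension).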
This is probably more useful with a contained union rather than a contained structure. In particular, it can be used to define something similar to a Pascal or Ada variant record:
enum variant_type { t_int, t_double, t_pointer, t_pair };
struct variant {
enum variant_type type;
union {
int i;
double d;
void *p;
struct {
int x;
int y;
};
};
};
This lets you refer to i, d, and p directly as members of a struct variant object rather than creating an artificial name for the variant portion. If some variants require more than one member, you can nest anonymous structures within the anonymous union.
(It differs from Pascal and Ada in that there's no mechanism to enforce which variant is active given the value of the type member; that's C for you.)
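Since nothing enforces the active variant, the check has to be written by hand. Here is one possible sketch, assuming C11; the functions variant_from_int, variant_from_pair and variant_get_int are hypothetical names invented for this illustration:

```c
#include <assert.h>
#include <stddef.h>

enum variant_type { t_int, t_double, t_pointer, t_pair };

struct variant {
    enum variant_type type;
    union {
        int i;
        double d;
        void *p;
        struct {
            int x;
            int y;
        };
    };
};

/* Hypothetical constructor: sets the tag and the matching member together. */
struct variant variant_from_int(int i)
{
    struct variant v;
    v.type = t_int;
    v.i = i;            /* i is reached directly, no union member name needed */
    return v;
}

/* Hypothetical constructor for the two-member variant. */
struct variant variant_from_pair(int x, int y)
{
    struct variant v;
    v.type = t_pair;
    v.x = x;            /* x and y come from the nested anonymous struct */
    v.y = y;
    return v;
}

/* Hypothetical accessor: returns 1 and stores the value only if the
   int variant is active, otherwise returns 0 and leaves *out alone. */
int variant_get_int(const struct variant *v, int *out)
{
    assert(v != NULL && out != NULL);
    if (v->type != t_int)
        return 0;
    *out = v->i;
    return 1;
}
```

Keeping the tag and the member in sync inside small constructor functions is just one convention; the language itself will happily let you read v.d after writing v.i.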
In your example, AToW is a typedef for a struct type that you defined previously. You're not permitted to have a bare
AToW;
in the middle of a struct definition, any more than you can have a bare
int;
C11 added the ability to define a nested anonymous struct within another struct, but only by defining a new anonymous struct type at that point. You can't have an anonymous struct member of a previously defined type. The language could have been defined to permit it, and the semantics would (I think) be reasonably straightforward -- but there wasn't much point in defining two different ways to do the same thing. (For "struct" in the above, read "struct or union".)
Quoting the N1570 draft (which is very close to the released 2011 ISO C standard), section 6.7.2.1 paragraph 13:
An unnamed member whose type specifier is a structure specifier with
no tag is called an anonymous structure; an unnamed member whose type
specifier is a union specifier with no tag is called an anonymous
union. The members of an anonymous structure or union are considered
to be members of the containing structure or union. This applies
recursively if the containing structure or union is also anonymous.
A structure specifier consists of the keyword struct, followed by an optional identifier (the tag, omitted in this case), followed by a sequence of declarations enclosed in { and }. In your case, AToW is a type name, not a structure specifier, so it can't be used to define an anonymous structure.
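For what it's worth, here is a sketch of how the union from the question could be written so that it does contain an anonymous structure. It is only an illustration under some assumptions: the union tag a_to_w and the helper a_to_w_roundtrip are invented names, only the three members shown in the question are written out, and the array length 3 is hard-coded because there is no longer a type name to apply sizeof to.

```c
#include <assert.h>
#include <stddef.h>

union a_to_w {
    struct {                    /* structure specifier with no tag, no member name */
        unsigned char a;
        unsigned char b;
        /* ... the question's other members would go here ... */
        unsigned char w;
    };
    unsigned char bytes[3];     /* assumption: matches the three members above */
};

/* This sketch assumes the unsigned char members are laid out without padding. */
static_assert(offsetof(union a_to_w, w) == 2, "unexpected padding before w");

/* Hypothetical helper: stores a value in w and reads it back through bytes. */
unsigned char a_to_w_roundtrip(unsigned char value)
{
    union a_to_w u = { .bytes = { 0, 0, 0 } };
    u.w = value;                /* w is considered a member of the union itself */
    return u.bytes[offsetof(union a_to_w, w)];
}
```

The price is that the struct type has no name, so it cannot be reused elsewhere; if you need the AToW typedef as well, give the member a name and write myUnion.fields.a (or similar) instead.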

Related

What is the rationale for "structure with flexible array member shall not be a member of a structure"?

C11, 6.7.2.1 Structure and union specifiers, Constraints, 3 (emphasis added):
A structure or union shall not contain a member with incomplete or function type (hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
Rationale for C, Revision 5.10, April-2003 (emphasis added):
Similarly, structures containing flexible arrays can’t occur in other structures or in arrays.
So, Rationale for C doesn't provide a rationale. What is the rationale?
A structure with a flexible array member can only really be used properly if allocated dynamically. For example:
struct s1 {
int a;
int b;
int c[];
};
...
struct s1 *x = malloc(sizeof(struct s1) + 5 * sizeof(int));
Let's assume typical struct padding and sizeof(int)==4. That would make sizeof(struct s1)==8.
Now imagine if such a struct was a member of another struct:
struct s2 {
int a;
struct s1 b;
int c;
};
The b member of struct s2 would start at offset 4. But what about the c member? Given that sizeof(struct s1)==8, that would make member c of struct s2 have offset 12. But then there's no way for the b member to have any space set aside for its own flexible array member c.
Because the offset of a given struct member is set at compile time, there is no way to allocate space for the flexible array member c inside of struct s1.
In theory, if the struct with a flexible array member was the last member:
struct s2 {
int a;
int b;
struct s1 c;
};
Then it could work, but that then means that struct s2 is also subject to the same rules as a struct with a flexible array member, causing a cascading effect. This would make it more difficult to determine which structures are subject to this rule.
So a struct with a flexible array member is not allowed as a member of another struct or an array.
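To make the dynamic allocation case concrete, here is a small sketch. The helper names s1_alloc and s1_fill are made up for this example, and overflow checking of the size computation is left out for brevity:

```c
#include <assert.h>
#include <stdlib.h>

struct s1 {
    int a;
    int b;
    int c[];    /* flexible array member: contributes nothing to sizeof */
};

/* Hypothetical helper: allocates a struct s1 with room for n elements in c.
   Returns NULL if malloc fails. The caller releases it with free(). */
struct s1 *s1_alloc(size_t n)
{
    struct s1 *x = malloc(sizeof *x + n * sizeof x->c[0]);
    if (x != NULL) {
        x->a = 0;
        x->b = 0;
    }
    return x;
}

/* Hypothetical helper: sets the first n elements of c to value.
   The caller must pass the same n that was used for the allocation. */
void s1_fill(struct s1 *x, size_t n, int value)
{
    assert(x != NULL);
    for (size_t i = 0; i < n; i++)
        x->c[i] = value;
}
```

The size of the trailing array is chosen at run time, per object, which is exactly what a fixed compile-time offset for a following member cannot accommodate.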

struct packing: how to add struct members at the beginning?

I'm implementing a binary tree in C89, and I'm trying to share common attributes among all node structs through composition. Thus I have the following code:
enum foo_type
{
FOO_TYPE_A,
FOO_TYPE_B
};
struct foo {
enum foo_type type;
};
struct foo_type_a {
struct foo base;
struct foo * ptr;
};
struct foo_type_b {
struct foo base;
char * text;
};
I'm including a member of type struct foo in all struct definitions as their initial member in order to provide access to the value held by enum foo_type regardless of struct type. To achieve this I'm expecting that a pointer to a structure object points to its initial member, but I'm not sure if this assumption holds in this case. With C99, the standard states the following (see ISO/IEC 9899:1999 6.7.2.1 §13)
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
Although all structs share a common struct foo object as their initial member, padding comes into play. While struct foo has only a single member, which is the size of an int, both struct foo_type_a and struct foo_type_b include pointer members, which in some cases increase the alignment and thus add padding.
So, considering this scenario, does the C programming language (C89 or any subsequent version) ensure that it's safe to access the value of struct foo::type through a pointer to an object, whether that object is of type struct foo or includes an object of type struct foo as its first member, such as struct foo_type_a or struct foo_type_b?
As you yourself quote from the C Standard, what you describe is supported by C99 and later versions.
It appears it was also supported by C89, as the language you quoted was already present in the ANSI C document from 1988:
3.5.2.1 Structure and union specifiers
...
Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the order
in which they are declared. A pointer to a structure object, suitably
cast, points to its initial member (or if that member is a bit-field,
then to the unit in which it resides), and vice versa. There may
therefore be unnamed holes within a structure object, but not at its
beginning, as necessary to achieve the appropriate alignment.
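As a rough sketch of how that guarantee gets used with the structs from the question (the functions foo_get_type, foo_from_a and foo_from_b are hypothetical names, written in C89 style):

```c
#include <assert.h>
#include <stddef.h>

enum foo_type { FOO_TYPE_A, FOO_TYPE_B };

struct foo { enum foo_type type; };
struct foo_type_a { struct foo base; struct foo *ptr; };
struct foo_type_b { struct foo base; char *text; };

/* Hypothetical helper: reads the tag through a pointer to the common part. */
enum foo_type foo_get_type(const struct foo *node)
{
    assert(node != NULL);
    return node->type;
}

/* Hypothetical helpers: a pointer to the whole struct, suitably converted,
   points to its initial member, so these casts yield the address of base. */
struct foo *foo_from_a(struct foo_type_a *a) { return (struct foo *)a; }
struct foo *foo_from_b(struct foo_type_b *b) { return (struct foo *)b; }
```

Any padding introduced by the pointer members comes after base, never before it, so offsetof(struct foo_type_a, base) and offsetof(struct foo_type_b, base) are both 0. Writing &a->base instead of the cast is equivalent and avoids the cast altogether.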

Structure Confusion

It's been a while since I was last in touch with the C language, so I was just going through some of the concepts, but I could not find any good source on structures.
Can anyone please explain
struct A
{
int a;
char b;
float c;
};
Is this the declaration or the definition of the structure A?
It declares a struct type with the tag A and the specified members. It neither defines an object nor reserves any storage for one.
From the C99 Standard, 6.7 Declarations:
Semantics
5 A declaration specifies the interpretation and attributes of a set of
identifiers. A definition of an identifier is a declaration for that
identifier that:
— for an object, causes storage to be reserved for that object;
— for a function, includes the function body; (footnote 98)
— for an enumeration constant or typedef name, is the (only) declaration of the identifier.
For a definition, you would need to provide an object identifier before the final semicolon:
struct A
{
int a;
char b;
float c;
} mystruct;
To also initialize mystruct you would write
struct A
{
int a;
char b;
float c;
} mystruct = { 42, 'x', 3.14 };
It is a declaration.
struct A; is a forward declaration, also called an incomplete declaration.
struct A
{
int a;
char b;
float c;
};
is a complete struct declaration.
Also check comp.lang.c FAQ list Question 11.5
After a forward declaration of a struct, you can use pointers to the struct, but you cannot dereference those pointers, apply the sizeof operator to the type, or create instances of the struct.
After the complete declaration, you can also create struct objects, apply the sizeof operator, and so on.
From 6.7.2.1 Structure and union specifiers from C11 specs
8 The type is incomplete until immediately after the } that terminates
the list, and complete thereafter.
And from 6.7.2.3 Tags
If a type specifier of the form struct-or-union identifier occurs
other than as part of one of the above forms, and no other declaration
of the identifier as a tag is visible, then it declares an incomplete
structure or union type, and declares the identifier as the tag of
that type. (Footnote 131: A similar construction with enum does not exist.)
This should not be confused with extern struct A aa; versus struct A aa = { /* some values */ };, which are respectively a declaration and a definition of the object aa.
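Putting the pieces of this thread together in one translation unit might look like the following sketch. The function name pass_through is invented for the example, and static_assert assumes a C11 compiler:

```c
#include <assert.h>

struct A;                               /* forward declaration: incomplete type */

/* Pointers to an incomplete type can be declared, passed and returned. */
struct A *pass_through(struct A *p)
{
    return p;                           /* no dereference and no sizeof here */
}

struct A {                              /* this declaration completes the type */
    int a;
    char b;
    float c;
};

/* sizeof is valid only now that the type is complete. */
static_assert(sizeof(struct A) >= sizeof(int) + sizeof(char) + sizeof(float),
              "struct A holds all three members");

struct A mystruct = { 42, 'x', 3.14f }; /* object definition: storage is reserved */
```

Moving the static_assert line above the struct body would make the compiler reject it, because sizeof cannot be applied to an incomplete type.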

Does the standard say these types can alias?

Something weird I thought of when reading up on strict aliasing.
Quote on the aliasing rules from the C standard:
An object shall have its stored value accessed only by an lvalue that has one of the following types:
the declared type of the object,
a qualified version of the declared type of the object,
a type that is the signed or unsigned type corresponding to the declared type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the declared type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
Does this mean that if I declare a variable of a type, say,
struct struct1 {
int a;
};
/* ... */
/* an object. The declared type of the object is struct struct1 */
struct struct1 test;
And declare another of a type, say,
struct struct2 { /* an aggregate or union type that includes... */
int a;
struct struct1 test; /* ...one of the aforementioned types among its members:
(the declared type of the object) */
};
/* ... */
struct struct2 test2;
Are they technically supposed to alias, as per the quote above? That seems very wrong.
What am I missing?

Struct pointer compatibility

Suppose we have two structs:
typedef struct Struct1
{
short a_short;
int id;
} Struct1;
typedef struct Struct2
{
short a_short;
int id;
short another_short;
} Struct2;
Is it safe to cast from Struct2 * to Struct1 *? What does the ANSI spec say about this?
I know that some compilers have the option to reorder struct fields to optimize memory usage, which might render the two structs incompatible. Is there any way to be sure this code will be valid, regardless of the compiler flag?
Thank you!
It is safe, as far as I know.
But it's far better, if possible, to do:
typedef struct {
Struct1 struct1;
short another_short;
} Struct2;
Then you've even told the compiler that Struct2 starts with an instance of Struct1, and since a pointer to a struct always points at its first member, you're safe to treat a Struct2 * as a Struct1 *.
Struct pointer types always have the same representation in C.
(C99, 6.2.5p27) "All pointers to structure types shall have the same
representation and alignment requirements as each other."
And members in structure types are always in order in C.
(C99, 6.7.2.1p5) "a structure is a type consisting of a sequence of
members, whose storage is allocated in an ordered sequence"
No, the standard doesn't allow this; accessing the elements of a Struct2 object through a Struct1 pointer is undefined behavior. Struct1 and Struct2 are not compatible types (as defined in 6.2.7) and may be padded differently, and accessing them via the wrong pointer also violates aliasing rules.
The only way something like this is guaranteed to work is when Struct1 is included in Struct2 as its initial member (6.7.2.1.15 in the standard), as in unwind's answer.
The language specification contains the following guarantee
6.5.2.3 Structure and union members
6 One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the completed type of the union
is visible. Two structures share a common initial sequence if corresponding members
have compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members.
This only applies to type-punning through unions. However, this essentially guarantees that the initial portions of these struct types will have identical memory layout, including padding.
The above does not necessarily allow one to do the same by casting unrelated pointer types. Doing so might constitute a violation of aliasing rules
6.5 Expressions
7 An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
The only question here is whether accessing
((Struct1 *) struct2_ptr)->a_short
constitutes access to the whole Struct2 object (in which case it is a violation of 6.5/7 and it is undefined), or merely access to a short object (in which case it might be perfectly defined).
In general, it might be a good idea to stick to the following rule: type-punning is allowed through unions but not through pointers. Don't do it through pointers, even if you are dealing with two struct types with a common initial subsequence of members.
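A minimal sketch of the union route covered by 6.5.2.3p6, using the two structs from the question; the union tag either and the function either_id are hypothetical names chosen for this example:

```c
#include <assert.h>
#include <stddef.h>

typedef struct Struct1 {
    short a_short;
    int id;
} Struct1;

typedef struct Struct2 {
    short a_short;
    int id;
    short another_short;
} Struct2;

/* Both structs live in one union, and the union type is visible at the
   point of access, which is what the common initial sequence rule asks for. */
union either {
    Struct1 s1;
    Struct2 s2;
};

/* Hypothetical helper: inspects the common initial part through s1,
   regardless of which of the two structs was stored last. */
int either_id(const union either *u)
{
    assert(u != NULL);
    return u->s1.id;
}
```

Here a_short and id form the common initial sequence, so reading them through s1 after storing through s2 is covered by the guarantee; another_short is not part of it.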
It will most probably work. But you are right to ask how you can be sure this code will be valid. So: somewhere in your program (at startup, maybe) embed a bunch of ASSERT statements which make sure that offsetof(Struct1, a_short) is equal to offsetof(Struct2, a_short), and so on. Besides, some programmer other than you might one day modify one of these structures but not the other, so better safe than sorry.
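One way such checks could look, assuming a C11 compiler so that they run at compile time via static_assert rather than at startup; layouts_match is a hypothetical run-time equivalent for older compilers:

```c
#include <assert.h>
#include <stddef.h>

typedef struct Struct1 {
    short a_short;
    int id;
} Struct1;

typedef struct Struct2 {
    short a_short;
    int id;
    short another_short;
} Struct2;

/* Compile-time checks: the build fails if the shared members drift apart. */
static_assert(offsetof(Struct1, a_short) == offsetof(Struct2, a_short),
              "a_short is at different offsets in Struct1 and Struct2");
static_assert(offsetof(Struct1, id) == offsetof(Struct2, id),
              "id is at different offsets in Struct1 and Struct2");

/* Hypothetical run-time version of the same comparison: returns 1 if the
   shared members have matching offsets, 0 otherwise. */
int layouts_match(void)
{
    return offsetof(Struct1, a_short) == offsetof(Struct2, a_short)
        && offsetof(Struct1, id) == offsetof(Struct2, id);
}
```

Note that these checks only cover layout. They say nothing about the aliasing concerns raised in the other answers.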
Yes, it is ok to do that!
A sample program is as follows.
#include <stdio.h>
typedef struct Struct1
{
short a_short;
int id;
} Struct1;
typedef struct Struct2
{
short a_short;
int id;
short another_short;
} Struct2;
int main(void)
{
Struct2 s2 = {1, 2, 3};
Struct1 *ptr = (Struct1 *)&s2; /* explicit cast: Struct2 * does not convert to Struct1 * implicitly */
void *vp = &s2;
Struct1 *s1ptr = (Struct1 *)vp;
printf("%d, %d \n", ptr->a_short, ptr->id);
printf("%d, %d \n", s1ptr->a_short, s1ptr->id);
return 0;
}

Resources