I'm designing an application and came across an implementation issue. I have the following struct definition:
app.h:
struct application_t{
void (*run_application)(struct application_t*);
void (*stop_application)(struct application_t*);
}
struct application_t* create();
The problem came when I tried to "implement" this application_t. I tend to define another struct:
app.c:
struct tcp_application_impl_t{
void (*run_application)(struct application_t*);
void (*stop_application)(struct application_t*);
int client_fd;
int socket_fd;
}
struct application_t* create(){
struct tcp_application_impl_t * app_ptr = malloc(sizeof(struct tcp_application_impl_t));
//do init
return (struct application_t*) app_ptr;
}
So if I use this as follows:
#include "app.h"
int main(){
struct application_t *app_ptr = create();
(app_ptr -> run_application)(app_ptr); //Is this behavior well-defined?
(app_ptr -> stop_application)(app_ptr); //Is this behavior well-defined?
}
The problem confusing me is if I this calling to (app_ptr -> run_application)(app_ptr); yeilds UB.
The "static type" of app_ptr if struct application_t*, but the "dynamic type" is struct tcp_application_impl_t*. The struct application_t and struct tcp_application_t are not compatible by N1570 6.2.7(p1):
there shall be a one-to-one correspondence between their members such
that each pair of corresponding members are declared with compatible
types
which obviously is not true in this case.
Can you please provide a reference to the Standard explaining the behavior?
Your two structs aren't compatible since they are different types. You have already found the chapter "compatible types" that defines what makes two structs compatible. The UB comes later when you access these structs with a pointer to the wrong type, strict aliasing violation as per 6.5/7.
The obvious way to solve this would have been this:
struct tcp_application_impl_t{
struct application_t app;
int client_fd;
int socket_fd;
}
Now the types may alias, since tcp_application_impl_t is an aggregate containing a application_t among its members.
An alternative to make this well-defined, is to use a sneaky special rule of "union common initial sequence", found hidden in C17 6.5.2.3/6:
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the completed type of the union
is visible. Two structures share a common initial sequence if corresponding members
have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
This would allow you to use your original types as you declared them. But somewhere in the same translation unit, you will have to add a dummy union typedef to utilize the above rule:
typedef union
{
struct application_t app;
struct tcp_application_impl_t impl;
} initial_sequence_t;
You don't need to actually use any instance of this union, it just needs to sit there visible. This tells the compiler that these two types are allowed to alias, as far as their common initial sequence goes. In your case, it means the function pointers but not the trailing variables in tcp_application_impl_t.
Edit:
Disclaimer. The common initial sequence trick is apparently a bit controversial, with compilers doing other things with it than the committee intended. And possibly works differently in C and C++. See union 'punning' structs w/ "common initial sequence": Why does C (99+), but not C++, stipulate a 'visible declaration of the union type'?
If the "strict aliasing rule" (N1570 6.5p7) is interpreted merely as specifying the circumstances under which things may alias (which would seem to be what the authors intended, given Footnote 88, which says "The intent of this list is to specify those circumstances in which an object may or may not be aliased") code like yours should pose no problem provided that in all contexts where an object is accessed using lvalues of two different types, one of the involved lvalues is visibly freshly derived from the other.
The only way 6.5p7 can make any sense is if operations involving objects that are freshly visibly derived from other objects are recognized as operations on the originals. The question of when to recognize such derivation is left as a quality-of-implementation issue, however, and thought the marketplace would be better able to judge than the Committee what was necessary for something to be a "quality" implementation suitable for some particular purpose.
If the goal is to write code that will work on implementations which are configured to honor the clear intention of footnote 88, one should be safe provided that objects don't alias. Upholding this requirement may require that one ensure that the compiler can see either that pointers are related to each other, or that they are each freshly derived from a common object at point of use. Given, e.g.
thing1 *p1 = unionArray[i].member1;
int v1 = p1->x;
thing2 *p2 = unionArray[j].member2;
p2->x = 31;
thing1 *p3 = unionArray[i].member1;
int v2 = p3->x;
each pointer would be used in a context where it was freshly derived from unionArray, and thus there would be no aliasing even if i==j. A compiler like "icc" will have no problems with such code, even with -fstrict-aliasing enabled, but because both gcc and clang impose the requirements of 6.5p7 upon programmers even in cases not involving aliasing, they will not process it correctly.
Note that if the code had been:
thing1 *p1 = unionArray[i].member1;
int v1 = p1->x;
thing2 *p2 = unionArray[j].member2;
p2->x = 31;
int v2 = p1->x;
then the second use of p1 would alias p2 in cases where i==j because p2 would access the storage associated with p1, via means not involving p1, between the time p1 is formed and the last time it is used (thus aliasing p1).
According to the authors of the Standard, the Spirit of C includes the principles "Trust the programmer" and "Don't prevent the programmer from doing what needs to be done". Unless there is a particular need to cope with the limitations of an implementation that is not particularly well suited to what one is doing, one should target implementations that uphold the Spirit of C in a fashion appropriate to one's purposes. The -fstrict-aliasing dialect processed by icc, or the -fno-strict-aliasing dialects processed by icc, gcc, and clang, should be suitable for your purposes. The -fstrict-aliasing dialects of gcc and clang should be recognized as simply unsuitable for your purposes, and not worth targeting.
Related
First, I apologize if this appears to be a duplicate, but I couldn't find exactly this question elsewhere
I was reading through N1570, specifically §6.5¶7, which reads:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
This reminded me of a common idiom I had seen in (BSD-like) socket programming, especially in the connect() call. Though the second argument to connect() is a struct sockaddr *, I have often seen passed to it a struct sockaddr_in *, which appears to work because they share a similar initial element. My question is:
To which contingency detailed in the above rule does this situation apply and why, or is it now undefined behavior that's an artifact of previous standard(s)?
This behavior is not defined by the C standard.
The behavior is defined by The Single Unix Specification and/or other documents relating to the software you are using, albeit in part implicitly.
The phrasing that “An object shall have its stored value accessed only by…” is misleading. The C standard cannot compel you to do anything; you are not obligated to obey its “shall” requirements. In terms of the C standard, the only consequence of not obeying its requirements is that the C standard does not define the behavior. This does not prohibit other documents from defining the behavior.
In the netinet/in.h documentation, we see “The sockaddr_in structure is used to store addresses for the Internet protocol family. Values of this type must be cast to struct sockaddr for use with the socket interfaces defined in this document.” So the documentation tells us not only that we should, but that we must, convert a sockaddr_in to a sockaddr. The fact that we must do so implies that the software supports it and that it will work. (Note that the phrasing is imprecise here; we do not actually cast a sockaddr_in to a sockaddr but actually convert the pointer, causing the sockaddr_in object in memory to be treated as a sockaddr.)
Thus there is an implied promise that the operating system, libraries, and developer tools provided for a Unix implementation support this.
This is an extension to the C language: Where behavior is not defined by the C standard, other documents may provide definitions and allow you to write software that cannot be written using the C standard alone. Behavior that the C standard says is undefined is not behavior that is prohibited but rather is an empty space that may be filled in by other specifications.
The rules about common initial sequences goes back to 1974. The earliest rules about "strict aliasing" only go back to 1989. The intention of the latter was not that they trump everything else, but merely that compilers be allowed to perform optimizations that their customers would find useful without being branded non-conforming. The Standard makes clear that in situations where one part of the Standard and/or an implementation's documentation would describe the behavior of some action but another part of the Standard would characterize it as Undefined Behavior, implementations may opt to give priority to the first, and the Rationale makes clear that the authors thought "the marketplace" would be better placed than the Committee to determine when implementations should do so.
Under a sufficiently pedantic reading of the N1570 6.5p7 constraints, almost all programs violate them, but in ways that won't matter unless an implementation is being sufficiently obtuse. The Standard makes no attempt to list all the situations in which an object of one type may be accessed by an lvalue of another, but rather those where a compiler must allow for an object of one type to be accessed by a seemingly unrelated lvalue of another. Given the code sequence:
int x;
int *p[10];
p[2] = &someStruct.intMember;
...
*p[2] = 23;
x = someStruct.intMember;
In the absence of the rules in 6.5p7, unless a compiler kept track of where p[2] came from, it would have no reason to recognize that the read of someStruct.member might be targeting storage that was just written using *p[2]. On the other hand, given the code:
int x;
int *p[10];
...
someStruct.intMember = 12;
p[2] = &someStruct.intMember;
x = *p[2];
Here, there is no rule that would actually allow the storage associated with a structure to be accessed by an lvalue of that member type, but unless a compiler is being deliberately blind, it would be able to see that after the first assignment to someStruct.intMember, the address of that member is being taken, and should either:
Account for all actions that will ever be done with the resulting pointer, if it is able to do so, or
Refrain from assuming that the structure's storage will not be accessed between the previous and succeeding actions using the structure's type.
I don't think it ever occurred to the people who were writing the rules that would later be renumbered as N1570 6.5p7 that they would be construed so as to disallow common patterns that exploited the Common Initial Sequence rule. As noted, most programs violate the constraints of 6.5p7, but do so in ways that would be processed predictably by any compiler that isn't being obtuse; those using the Common Initial Sequence guarantees would have fallen into that category. Since the authors of the Standard recognized the possibility of a "conforming" compiler that was only capable of meaningfully processing one contrived and useless program, the fact that an obtuse compiler could abuse the "aliasing rules" wasn't seen as a defect.
Is it OK do do something like this?
struct MyStruct {
int x;
const char y; // notice the const
unsigned short z;
};
struct MyStruct AStruct;
fread(&MyStruct, sizeof (MyStruct), 1,
SomeFileThatWasDefinedEarlierButIsntIncludedInThisCodeSnippet);
I am changing the constant struct member by writing to the entire struct from a file. How is that supposed to be handled? Is this undefined behavior, to write to a non-constant struct, if one or more of the struct members is constant? If so, what is the accepted practice to handle constant struct members?
It's undefined behavior.
The C11 draft n1570 says:
6.7.3 Type qualifiers
...
...
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
My interpretation of this is: To be compliant with the standard, you are only allowed to set the value of the const member during object creation (aka initialization) like:
struct MyStruct AStruct = {1, 'a', 2}; // Fine
Doing
AStruct.y = 'b'; // Error
should give a compiler error.
You can trick the compiler with code like:
memcpy(&AStruct, &AnotherStruct, sizeof AStruct);
It will probably work fine on most systems but it's still undefined behavior according to the C11 standard.
Also see memcpy with destination pointer to const data
How are constant struct members handled in C?
Read the C11 standard n1570 and its §6.7.3 related to the const qualifier.
If so, what is the accepted practice to handle constant struct members?
It depends if you care more about strict conformance to the C standard, or about practical implementations. See this draft report (work in progress in June 2020) discussing these concerns. Such considerations depend on the development efforts allocated on your project, and on portability of your software (to other platforms).
It is likely that you won't spend the same efforts on the embedded software of a Covid respirator (or inside some ICBM) and on the web server (like lighttpd or a library such as libonion or some FastCGI application) inside a cheap consumer appliance or running on some cheap rented Linux VPS.
Consider also using static analysis tools such as Frama-C or the Clang static analyzer on your code.
Regarding undefined behavior, be sure to read this blog.
See also this answer to a related question.
I am changing the constant struct member by writing to the entire struct from a file.
Then endianness issues and file system issues are important. Consider perhaps using libraries related to JSON, to YAML, perhaps mixed to sqlite or PostGreSQL or TokyoCabinet (and the source code of all these open source libraries, or from the Linux kernel, could be inspirational).
The Standard is a bit sloppy in its definition and use of the term "object". For a statement like "All X must be Y" or "No X may be Z" to be meaningful, the definition of X must have criteria that are not only satisfied by all X, but that would unambiguously exclude all objects that aren't required to be Y or are allowed to be Z.
The definition of "object", however, is simply "region of data storage in the execution environment, the contents of which can represent values". Such a definition, however, fails to make clear whether every possible range of consecutive addresses is always an "object", or when various possible ranges of addresses are subject to the constraints that apply to "objects" and when they are not.
In order for the Standard to unambiguously classify a corner case as defined or undefined, the Committee would have to reach a consensus as to whether it should be defined or undefined. If the Committee members fundamentally disagree about whether some cases should be defined or undefined, the only way to pass a rule by consensus will be if the rule is written ambiguously in a way that allows people with contradictory views about what should be defined to each think the rule supports their viewpoint. While I don't think the Committee members explicitly wanted to make their rules ambiguous, I don't think the Committee could have been consensus for rules that weren't.
Given that situation, many actions, including updating structures that have constant members, most likely falls in the realm of actions which the Standard doesn't require implementations to process meaningfully, but which the authors of the Standard would have expected that implementations would process meaningfully anyhow.
This is in regards to some old (pre-C89) code that I'm working on. Part of that code is a small library that defines these structures in the header file:
// lib.h
struct data_node {
const struct data_node *next;
const struct data_node *prev;
void *data;
};
struct trace_node {
const struct trace_node *next;
const struct trace_node *prev;
unsigned int id;
const char *file;
int line;
};
const struct trace_node *get_trace(void);
The source file redefines those same structures, like so:
// lib.c
// does *not* include "lib.h"
struct data_node {
struct data_node *next;
struct data_node *prev;
void *data;
};
struct trace_node {
struct trace_node *next;
struct trace_node *prev;
unsigned int id;
const char *file;
int line;
struct data_node *syncData; /* not included in header file version */
};
It works like you would expect: the syncData field is not visible to client code that includes the "lib.h" header.
Background
The library maintains 2 internal lists: the trace list, and the data list. The syncData field keeps the 2 lists in sync (go figure).
If client code had access to the syncData field, it could disrupt the synchronization between the lists. But, the trace list can get pretty large, so rather than copying every node into a smaller version of the struct, it just returns the address of the sentinel node for the internal list.
Question
I've compiled this with -Wall, -Wpedantic, and -Wextra, and I can't get gcc to complain about it, both with -std=c99 and -std=c11. A hex dump of the memory shows the bytes for the hidden field, right where they ought to be.
The relevant section of the standard (6.2.7.1) says:
Two types have compatible type if their types are the same. Additional rules for
determining whether two types are compatible are described in 6.7.2 for type specifiers,
in 6.7.3 for type qualifiers, and in 6.7.5 for declarators.46) Moreover, two structure,
union, or enumerated types declared in separate translation units are compatible if their
tags and members satisfy the following requirements: If one is declared with a tag, the
other shall be declared with the same tag. If both are complete types, then the following
additional requirements apply: there shall be a one-to-one correspondence between their
members such that each pair of corresponding members are declared with compatible
types, and such that if one member of a corresponding pair is declared with a name, the
other member is declared with the same name. For two structures, corresponding
members shall be declared in the same order. For two structures or unions, corresponding
bit-fields shall have the same widths. For two enumerations, corresponding members
shall have the same values.
Which, depending on how you want to read it, could be taken to say that compatible struct definitions are restricted to ONLY having corresponding pairs of members (and no others), or that struct definitions are compatible if, where they do have corresponding pairs of members, those pairs meet the requirements.
I don't think this is Undefined Behavior. At worst, I think it may be unspecified. Should I refactor this to use 2 distinct struct definitions? Doing this would require a performance hit to allocate a new public node for each node in the internal list and copy over the public data.
This has undefined behavior, and for good reasons.
First, the text clearly states that compatible struct must have a one-to-one correspondance between fields. So the behavior is undefined if client and library access the same object. A compiler can't detect that undefined behavior, because this rule is about stiching together knowledge from two different translation units that are compiled separately. This is why you don't see any diagnostic.
The reason that your example is particularly bad practise is that not even the sizes of the two struct types aggree and maybe not even their alignment. So a client that accesses such an object will make false assumptions about optimization opportunities.
If lib.c does not include lib.h then definitions from there are not visible to lib.c so no conflicts.
C does not use function overloading and so it does not use name mangling and so if you have this declaration:
struct trace_node *get_trace(void);
in one place but function is implemented as
struct foo_trace_node *get_trace(void);
then the linker will happily link your code with that get_trace()
What you do is directly violating standards. A "one-to-one correspondance" would expect the pointer to be seen by client code as well. Your code violates the first part of that sentence.
Imagine the client code to unlink one structure from the list and create and link in one of "his" structures with the wrong size - Lib would crash sooner or later when dereferencing that pointer.
If you don't want to expose certain fields of structures, don't expose the structure at all. Hand the client code an anonymous structure pointer and expose accessor functions in the lib that return field values the client is allowed to see. Or pack the "allowed part" into an embedded structure within the larger one and hand that to client code if you want to avoid field-by-field access. It's probably also not a good idea to have client code see the link pointers of your list structure.
I found some additional information in
The New C Standard
An Economic and Cultural Commentary
Derek M. Jones
derek#knosof.co.uk
http://www.coding-guidelines.com/cbook/cbook1_2.pdf
This particular version covers the C99 standard, but since the text for the relevant section is identical in both versions of the standard, it's a wash.
Pertinent commentary:
633
Moreover, two structure, union, or enumerated types declared in
separate translation units are compatible if their tags and members
satisfy the following requirements:
Commentary
These requirements apply if the structure or union type was declared
via a typedef or through any other means. Because there can be more
than one declaration of a type in the same translation unit, these
requirements really apply to the composite type in each translation
unit. In the following list of requirements, those that only apply to
structures, unions, enumerations, or a combination thereof are
explicitly called out as such. Two types are compatible if they obey
both of the following requirements:
• Tag compatibility.
If both types have tags, both shall be the same.
If one, or neither, type has a tag, there is no requirement to be obeyed.
• Member compatibility.
Here the requirement is that for every member in both types there is a
corresponding member in the other type that has the following
properties:
The corresponding members have a compatible type.
The corresponding members either have the same name or are unnamed.
For structure types, the corresponding members are defined in the same order in their respective definitions.
For structure and union types, the corresponding members shall either both be bit-fields having the same width, or neither shall be
bit-fields.
For enumerated types, the corresponding members (the enumeration constants) have the same value.
So the verdict is unanimous. The wording in Mr. Jones' commentary is perhaps a little bit clearer (to me, at any rate).
I failed to mention in the OP that the original header file comments for the get_trace() function clearly state that the trace list is to be considered strictly read-only, so the points raised about an object of the abbreviated structure in client code finding its way back into the library code are - while still valid in the general sense - not exactly applicable in this specific case.
However, the question of compiler optimizations is bang on, especially considering how much more aggressive compiler optimizations are now, versus 35 years ago. So, I will refactor.
As previously established, a union of the form
union some_union {
type_a member_a;
type_b member_b;
...
};
with n members comprises n + 1 objects in overlapping storage: One object for the union itself and one object for each union member. It is clear, that you may freely read and write to any union member in any order, even if reading a union member that was not the last one written to. The strict aliasing rule is never violated, as the lvalue through which you access the storage has the correct effective type.
This is further supported by footnote 95, which explains how type punning is an intended use of unions.
A typical example of the optimizations enabled by the strict aliasing rule is this function:
int strict_aliasing_example(int *i, float *f)
{
*i = 1;
*f = 1.0;
return (*i);
}
which the compiler may optimize to something like
int strict_aliasing_example(int *i, float *f)
{
*i = 1;
*f = 1.0;
return (1);
}
because it can safely assume that the write to *f does not affect the value of *i.
However, what happens when we pass two pointers to members of the same union? Consider this example, assuming a typical platform where float is an IEEE 754 single precision floating point number and int is a 32 bit two's complement integer:
int breaking_example(void)
{
union {
int i;
float f;
} fi;
return (strict_aliasing_example(&fi.i, &fi.f));
}
As previously established, fi.i and fi.f refer to an overlapping memory region. Reading and writing them is unconditionally legal (writing is only legal once the union has been initialized) in any order. In my opinion, the previously discussed optimization performed by all major compilers yields incorrect code as the two pointers of different type legally point to the same location.
I somehow can't believe that my interpretation of the strict aliasing rule is correct. It doesn't seem plausible that the very optimization the strict aliasing was designed for is not possible due to the aforementioned corner case.
Please tell me why I'm wrong.
A related question turned up during research.
Please read all existing answers and their comments before adding your own to make sure that your answer adds a new argument.
Starting with your example:
int strict_aliasing_example(int *i, float *f)
{
*i = 1;
*f = 1.0;
return (*i);
}
Let's first acknowledge that, in the absence of any unions, this would violate the strict aliasing rule if i and f both point to the same object; assuming the object has no declared type, then *i = 1 sets the effective type to int and *f = 1.0 then sets it to float, and the final return (*i) then accesses an object with effective type of float via an lvalue of type int, which is clearly not allowed.
The question is about whether this would still amount to a strict-aliasing violation if both i and f point to members of the same union. For this not to be the case, it would either have to be that there is some special exemption from the strict aliasing rule that applies in this situation, or that accessing the object via *i does not (also) access the same object as *f.
On union member access via the "." member access operator, the standard says (6.5.2.3):
A postfix expression followed by the . operator and an identifier
designates a member of a structure or union object. The value is that
of the named member (95) and is an lvalue if the first expression is
an lvalue.
The footnote 95 referred to in above says:
If the member used to read the contents of a union object is not the
same as the member last used to store a value in the object, the
appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as described
in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be
a trap representation.
This is clearly intended to allow type punning via a union, but it should be noted that (1) footnotes are non-normative, that is, they are not supposed to proscribe behaviour, but rather they should clarify the intention of some part of the text in accordance with the rest of the specification, and (2) this allowance for type punning via a union is deemed by compiler vendors as applying only for access via the union member access operator - since otherwise strict aliasing is pretty useless for optimisation, as just about any two pointers potentially refer to different members of the same union (your example is a case in point).
So at this point, we can say that:
the code in your example is explicitly allowed by a non-normative footnote
the normative text on the other hand seems to disallow your example (due to strict aliasing), assuming that accessing one member of a union also constitutes access to another - but more on this shortly
Does accessing one member of a union actually access the others, though? If not, the strict aliasing rule isn't concerned with the example. (If it does, the strict aliasing rule, problematically, disallows just about any type-punning via a union).
A union is defined as (6.2.5 para 20):
A union type describes an overlapping nonempty set of member objects
And note that (6.7.2.1 para 16):
The value of at most one of the members can be stored in a union object at any time
Since access is (3):
〈execution-time action〉 to read or modify the value of an object
... and, since non-active union members do not have a stored value, then presumably accessing one member does not constitute access to the other members!
However, the definition of member access (6.5.2.3, quoted above) says "The value is that of the named member" (this is the precise statement that footnote 95 is attached to) - if the member has no value, what then? Footnote 95 gives an answer but as I've noted it is not supported by the normative text.
In any case, nothing in the text would seem to imply that reading or modifying a union member "via the member object" (i.e. directly via an expression using the member access operator) should be any different than reading or modifying it via pointer to that same member. The consensus understanding applied by compiler vendors, which allows them to perform optimisations under the assumption that pointers of different types do not alias, and that requires type punning be performed only via expressions involving member access, is not supported by the text of the standard.
If footnote 95 is considered normative, your example is perfectly fine code without undefined behaviour (unless the value of (*i) is a trap representation), according to the rest of the text. However, if footnote 95 is not considered normative, there is an attempted access to an object which has no stored value and the behaviour then is at best unclear (though the strict aliasing rule is arguably not relevant).
In the understanding of compiler vendors currently, your example has undefined behaviour, but since this isn't specified in the standard it's not clear exactly what constraint the code violates.
Personally, I think the "fix" to the standard is to:
disallow access to a non-active union member except via lvalue conversion of a member access expression, or via assignment where the left-hand-side is a member access expression (an exception to this could perhaps be made for when the member in question has character type, since that would not have an effect on possible optimisations due to a similar exception in the strict aliasing rule itself)
specify in the normative text that the value of a non-active member is as is currently described by footnote 95
That would make your example not a violation of the strict aliasing rule, but rather a violation of the constraint that a non-active union member must be accessed only via an expression containing the member access operator (and appropriate member).
Therefore, to answer your question - Is the strict aliasing rule incorrectly specified? - no, the strict aliasing rule is not relevant to this example because the objects accessed by the two pointer dereferences are separate objects and, even though they overlap in storage, only one of them has a value at a time. However, the union member access rules are incorrectly specified.
A note on Defect Report 236:
Arguments about union semantics invariably refer to DR 236 at some point. Indeed, your example code is superficially very similar to the code in that Defect Report. I would note that:
The example in DR 236 is not about type-punning. It is about whether it is ok to assign to a non-active union member via a pointer to that member. The code in question is subtly different to that in the question here, since it does not attempt to access the "original" union member again after writing to the second member. Thus, despite the structural similarity in the example code, the Defect Report is largely unrelated to your question.
"Committee believes that Example 2 violates the aliasing rules in 6.5 paragraph 7" - this indicates that the committee believes that writing a "non-active" union member, but not via an expression containing a member access of the union object, is a strict-aliasing violation. As I've detailed above, this is not supported by the text of the standard.
"In order to not violate the rules, function f in example should be written as" - i.e. you must use the union object (and the "." operator) to change the active member type; this is in agreement with the "fix" to the standard I proposed above.
The Committee Response in DR 236 claims that "Both programs invoke undefined behavior". It has no explanation for why the first does so, and its explanation for why the 2nd does so seems to be wrong.
Under the definition of union members in §6.5.2.3:
3 A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. ...
4 A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. ...
See also §6.2.3 ¶1:
the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
It is clear that footnote 95 refers to the access of a union member with the union in scope and using the . or -> operator.
Since assignments and accesses to the bytes comprising the union are not made through union members but through pointers, your program does not invoke the aliasing rules of union members (including those clarified by footnote 95).
Further, normal aliasing rules are violated since the effective type of the object after *f = 1.0 is float, but its stored value is accessed by an lvalue of type int (see §6.5 ¶7).
Note: All references cite this C11 standard draft.
The C11 standard (§6.5.2.3.9 EXAMPLE 3) has following example:
The following is not a valid fragment (because the union type is not
visible within function f):
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g()
{
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}
But I can't find more clarification on this.
Essentially the strict aliasing rule describes circumstances in which a compiler is permitted to assume (or, conversely, not permitted to assume) that two pointers of different types do not point to the same location in memory.
On that basis, the optimisation you describe in strict_aliasing_example() is permitted because the compiler is allowed to assume f and i point to different addresses.
The breaking_example() causes the two pointers passed to strict_aliasing_example() to point to the same address. This breaks the assumption that strict_aliasing_example() is permitted to make, therefore results in that function exhibiting undefined behaviour.
So the compiler behaviour you describe is valid. It is the fact that breaking_example() causes the pointers passed to strict_aliasing_example() to point to the same address which causes undefined behaviour - in other words, breaking_example() breaks the assumption that the compiler is allowed to make within strict_aliasing_example().
The strict aliasing rule forbids access to the same object by two pointers that do not have compatible types, unless one is a pointer to a character type:
7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:88)
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
In your example, *f = 1.0; is modifying fi.i, but the types are not compatible.
I think the mistake is in thinking that a union contains n objects, where n is the number of members. A union contains only one active object at any point during program execution by §6.7.2.1 ¶16
The value of at most one of the members can be stored in a union object at any time.
Support for this interpretation that a union does not simultaneously contain all of its member objects can be found in §6.5.2.3:
and if the union object currently contains one of these structures
Finally, an almost identical issue was raised in defect report 236 in 2006.
Example 2
// optimization opportunities if "qi" does not alias "qd"
void f(int *qi, double *qd) {
int i = *qi + 2;
*qd = 3.1; // hoist this assignment to top of function???
*qd *= i;
return;
}
main() {
union tag {
int mi;
double md;
} u;
u.mi = 7;
f(&u.mi, &u.md);
}
Committee believes that Example 2 violates the aliasing rules in 6.5
paragraph 7:
"an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union)."
In order to not violate the rules, function f in example should be
written as:
union tag {
int mi;
double md;
} u;
void f(int *qi, double *qd) {
int i = *qi + 2;
u.md = 3.1; // union type must be used when changing effective type
*qd *= i;
return;
}
Here is note 95 and its context:
A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member, (95) and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.
(95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”). This might be a trap representation.
Note 95 clearly applies to an access via a union member. Your code does not do that. Two overlapping objects are accessed via pointers to 2 separate types, none of which is a character type, and none of which is a postfix expression pertinent for type punning.
This is not a definitive answer...
Let's back away from the standard for a second, and think about what's actually possible for a compiler.
Suppose that strict_aliasing_example() is defined in strict_aliasing_example.c, and breaking_example() is defined in breaking_example.c. Assume both of these files are compiled separately and then linked together, like so:
gcc -c -o strict_aliasing_example.o strict_aliasing_example.c
gcc -c -o breaking_example.o breaking_example.c
gcc -o breaking_example strict_aliasing_example.o breaking_example.o
Of course we'll have to add a function prototype to breaking_example.c, which looks like this:
int strict_aliasing_example(int *i, float *f);
Now consider that the first two invocations of gcc are completely independent and cannot share information except for the function prototype. It is impossible for the compiler to know that i and j will point to members of the same union when it generates code for strict_aliasing_example(). There's nothing in the linkage or type system to specify that these pointers are somehow special because they came from a union.
This supports the conclusion that other answers have mentioned: from the standard's point of view, accessing a union via . or -> obeys different aliasing rules compared with dereferencing an arbitrary pointer.
Prior to the C89 Standard, the vast majority of implementations defined the behavior of write-dereferencing to pointer of a particular type as setting the bits of the underlying storage in the fashion defined for that type, and defined the behavior of read-dereferencing a pointer of a particular type as reading the bits of the underlying storage in the fashion defined for that type. While such abilities would not have been useful on all implementations, there were many implementations where the performance of hot loops could be greatly improved by e.g. using 32-bit loads and stores to operate on groups of four bytes at once. Further, on many such implementations, supporting such behaviors didn't cost anything.
The authors of the C89 Standard state that one of their objectives was to avoid irreparably breaking existing code, and there are two fundamental ways the rules could have been interpreted consistent with that:
The C89 rules could have been intended to be applicable only in the cases similar to the one given in the rationale (accessing an object with declared type both directly via that type and indirectly via pointer), and where compilers would not have reason to expect that lvalues are related. Keeping track for each variable whether it is currently cached in a register is pretty simple, and being able to keep such variables in registers while accessing pointers of other types is a simple and useful optimization and would not preclude support for code which uses the more common type punning patterns (having a compiler interpret a float* to int* cast as necessitating a flush of any register-cached float values is simple and straightforward; such casts are rare enough that such an approach would be unlikely to adversely affect performance).
Given that the Standard is generally agnostic with regard to what makes a good-quality implementation for a given platform, the rules could be interpreted as allowing implementations to break code which uses type punning in ways that would be both useful and obvious, without suggesting that good quality implementations shouldn't try to avoid doing so.
If the Standard defines a practical way of allowing in-place type punning which is not in any way significantly inferior to other approaches, then approaches other than the defined way might reasonably be regarded as deprecated. If no Standard-defined means exists, then quality implementations for platforms where type punning is necessary to achieve good performance should endeavor to efficiently support common patterns on those platforms whether or not the Standard requires them to do so.
Unfortunately, the lack of clarity as to what the Standard requires has resulted in a situation where some people regard as deprecated constructs for which no replacements exist. Having the existence of a complete union type definition involving two primitive types be interpreted as an indication that any access via pointer of one type should be regarded as a likely access to the other would make it possible to adjust programs which rely upon in-place type punning to do so without Undefined Behavior--something which is not achievable any other practical way given the present Standard. Unfortunately, such an interpretation would also limit many optimizations in the 99% of cases where they would be harmless, thus making it impossible for compilers which interpret the Standard that way to run existing code as efficiently as would otherwise be possible.
As to whether the rule is correctly specified, that would depend upon what it is supposed to mean. Multiple reasonable interpretations are possible, but combining them yields some rather unreasonable results.
PS--the only interpretation of the rules regarding pointer-comparisons and memcpy that would make sense without giving the term "object" a meaning different from its meaning in the aliasing rules would suggest that no allocated region can be used to hold more than a single kind of object. While some kinds of code might be able to abide such a restriction, it would make it impossible for programs to use their own memory management logic to recycle storage without excessive numbers of malloc/free calls. The authors of the Standard may have intended to say that implementations are not required to let programmers create a large region and partition it into smaller mixed-type chunks themselves, but that doesn't mean that they intended general-purpose implementations would fail to do so.
The Standard does not allow the stored value of a struct or union to be accessed using an lvalue of the member type. Since your example accesses the stored value of a union using lvalues whose type is not that of the union, nor any type that contains that union, behavior would be Undefined on that basis alone.
The one thing that gets tricky is that under a strict reading of the Standard, even something so straightforward as
int main(void)
{
struct { int x; } foo;
foo.x = 1;
return 0;
}
also violates N1570 6.5p7 because foo.x is an lvalue of type int, it is used to access the stored value of an object of type struct foo, and type int does not satisfy any of the conditions on that section.
The only way the Standard can be even remotely useful is if one recognizes that there need to be exceptions to N1570 6.5p7 in cases involving lvalues that are derived from other lvalues. If the Standard were to describe cases where compilers may or must recognize such derivation, and specify that N1570 6.5p7 only applies in cases where storage is accessed using more than one type within a particular execution of a function or loop, that would have eliminated a lot of complexity including any need for the notion of "Effective Type".
Unfortunately, some compilers have taken it upon themselves to ignore derivation of lvalues and pointers even in some obvious cases like:
s1 *p1 = &unionArr[i].v1;
p1->x ++;
It may be reasonable for a compiler to fail to recognize the association between p1 and unionArr[i].v1 if other actions involving unionArr[i] separated the creation and use of p1, but neither gcc nor clang can consistently recognize such association even in simple cases where the use of the pointer immediately follows the action which takes the address of the union member.
Again, since the Standard doesn't require that compilers recognize any usage of derived lvalues unless they are of character types, the behavior of gcc and clang does not make them non-conforming. On the other hand, the only reason they are conforming is because of a defect in the Standard which is so outrageous that nobody reads the Standard as saying what it actually does.
I recently came across a colleague's code that looked like this:
typedef struct A {
int x;
}A;
typedef struct B {
A a;
int d;
}B;
void fn(){
B *b;
((A*)b)->x = 10;
}
His explanation was that since struct A was the first member of struct B, so b->x would be the same as b->a.x and provides better readability.
This makes sense, but is this considered good practice? And will this work across platforms? Currently this runs fine on GCC.
Yes, it will work cross-platform(a), but that doesn't necessarily make it a good idea.
As per the ISO C standard (all citations below are from C11), 6.7.2.1 Structure and union specifiers /15, there is not allowed to be padding before the first element of a structure
In addition, 6.2.7 Compatible type and composite type states that:
Two types have compatible type if their types are the same
and it is undisputed that the A and A-within-B types are identical.
This means that the memory accesses to the A fields will be the same in both A and B types, as would the more sensible b->a.x which is probably what you should be using if you have any concerns about maintainability in future.
And, though you would normally have to worry about strict type aliasing, I don't believe that applies here. It is illegal to alias pointers but the standard has specific exceptions.
6.5 Expressions /7 states some of those exceptions, with the footnote:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
The exceptions listed are:
a type compatible with the effective type of the object;
some other exceptions which need not concern us here; and
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union).
That, combined with the struct padding rules mentioned above, including the phrase:
A pointer to a structure object, suitably converted, points to its initial member
seems to indicate this example is specifically allowed for. The core point we have to remember here is that the type of the expression ((A*)b) is A*, not B*. That makes the variables compatible for the purposes of unrestricted aliasing.
That's my reading of the relevant portions of the standard, I've been wrong before (b), but I doubt it in this case.
So, if you have a genuine need for this, it will work okay but I'd be documenting any constraints in the code very close to the structures so as to not get bitten in future.
(a) In the general sense. Of course, the code snippet:
B *b;
((A*)b)->x = 10;
will be undefined behaviour because b is not initialised to something sensible. But I'm going to assume this is just example code meant to illustrate your question. If anyone's concerned about it, think of it instead as:
B b, *pb = &b;
((A*)pb)->x = 10;
(b) As my wife will tell you, frequently and with little prompting :-)
I'll go out on a limb and oppose #paxdiablo on this one: I think it's a fine idea, and it's very common in large, production-quality code.
It's basically the most obvious and nice way to implement inheritance-based object oriented data structures in C. Starting the declaration of struct B with an instance of struct A means "B is a sub-class of A". The fact that the first structure member is guaranteed to be 0 bytes from the start of the structure is what makes it work safely, and it's borderline beautiful in my opinion.
It's widely used and deployed in code based on the GObject library, such as the GTK+ user interface toolkit and the GNOME desktop environment.
Of course, it requires you to "know what you're doing", but that is generally always the case when implementing complicated type relationships in C. :)
In the case of GObject and GTK+, there's plenty of support infrastructure and documentation to help with this: it's quite hard to forget about it. It might mean that creating a new class isn't something you do just as quickly as in C++, but that's perhaps to be expected since there's no native support in C for classes.
That's a horrible idea. As soon as someone comes along and inserts another field at the front of struct B your program blows up. And what is so wrong with b.a.x?
Anything that circumvents type checking should generally be avoided.
This hack rely on the order of the declarations and neither the cast nor this order can be enforced by the compiler.
It should work cross-platform, but I don't think it is a good practice.
If you really have deeply nested structures (you might have to wonder why, however), then you should use a temporary local variable to access the fields:
A deep_a = e->d.c.b.a;
deep_a.x = 10;
deep_a.y = deep_a.x + 72;
e->d.c.b.a = deep_a;
Or, if you don't want to copy a along:
A* deep_a = &(e->d.c.b.a);
deep_a->x = 10;
deep_a->y = deep_a->x + 72;
This shows from where a comes and it doesn't require a cast.
Java and C# also regularly expose constructs like "c.b.a", I don't see what the problem is. If what you want to simulate is object-oriented behaviour, then you should consider using an object-oriented language (like C++), since "extending structs" in the way you propose doesn't provide encapsulation nor runtime polymorphism (although one may argue that ((A*)b) is akin to a "dynamic cast").
I am sorry to disagree with all the other answers here, but this system is not compliant to standard C. It is not acceptable to have two pointers with different types which point to the same location at the same time, this is called aliasing and is not allowed by the strict aliasing rules in C99 and many other standards. A less ugly was of doing this would be to use in-line getter functions which then do not have to look neat in that way. Or perhaps this is the job for a union? Specifically allowed to hold one of several types, however there are a myriad of other drawbacks there too.
In short, this kind of dirty casting to create polymorphism is not allowed by most C standards, just because it seems to work on your compiler does not mean it is acceptable. See here for an explanation of why it is not allowed, and why compilers at high optimization levels can break code which does not follow these rules http://en.wikipedia.org/wiki/Aliasing_%28computing%29#Conflicts_with_optimization
Yes, it will work. And it is one of the core principle of Object Oriented using C. See this answer 'Object-orientation in C' for more examples about extending (i.e inheritance).
This is perfectly legal, and, in my opinion, pretty elegant. For an example of this in production code, see the GObject docs:
Thanks to these simple conditions, it is possible to detect the type
of every object instance by doing:
B *b;
b->parent.parent.g_class->g_type
or, more quickly:
B *b;
((GTypeInstance*)b)->g_class->g_type
Personally, I think that unions are ugly and tend to lead towards huge switch statements, which is a big part of what you've worked to avoid by writing OO code. I write a significant amount of code myself in this style --- typically, the first member of the struct contains function pointers that can be made to work like a vtable for the type in question.
I can see how this works but I would not call this good practice. This is depending on how the bytes of each data structure is placed in memory. Any time you are casting one complicated data structure to another (ie. structs), it's not a very good idea, especially when the two structures are not the same size.
I think the OP and many commenters have latched onto the idea that the code is extending a struct.
It is not.
This is and example of composition. Very useful. (Getting rid of the typedefs, here is a more descriptive example ):
struct person {
char name[MAX_STRING + 1];
char address[MAX_STRING + 1];
}
struct item {
int x;
};
struct accessory {
int y;
};
/* fixed size memory buffer.
The Linux kernel is full of embedded structs like this
*/
struct order {
struct person customer;
struct item items[MAX_ITEMS];
struct accessory accessories[MAX_ACCESSORIES];
};
void fn(struct order *the_order){
memcpy(the_order->customer.name, DEFAULT_NAME, sizeof(DEFAULT_NAME));
}
You have a fixed size buffer that is nicely compartmentalized. It sure beats a giant single tier struct.
struct double_order {
struct order order;
struct item extra_items[MAX_ITEMS];
struct accessory extra_accessories[MAX_ACCESSORIES];
};
So now you have a second struct that can be treated (a la inheritance) exactly like the first with an explicit cast.
struct double_order d;
fn((order *)&d);
This preserves compatibility with code that was written to work with the smaller struct. Both the Linux kernel (http://lxr.free-electrons.com/source/include/linux/spi/spi.h (look at struct spi_device)) and bsd sockets library (http://beej.us/guide/bgnet/output/html/multipage/sockaddr_inman.html) use this approach. In the kernel and sockets cases you have a struct that is run through both generic and differentiated sections of code. Not all that different than the use case for inheritance.
I would NOT suggest writing structs like that just for readability.
I think Postgres does this in some of their code as well. Not that it makes it a good idea, but it does say something about how widely accepted it seems to be.
Perhaps you can consider using macros to implement this feature, the need to reuse the function or field into the macro.