Violating of strict-aliasing in C, even without any casting? - c

How can *i and u.i print different numbers in this code, even though i is defined as int *i = &u.i;? I can only assuming that I'm triggering UB here, but I can't see how exactly.
(ideone demo replicates if I select 'C' as the language. But as #2501 pointed out, not if 'C99 strict' is the language. But then again, I get the problem with gcc-5.3.0 -std=c99!)
// gcc -fstrict-aliasing -std=c99 -O2
union
{
int i;
short s;
} u;
int * i = &u.i;
short * s = &u.s;
int main()
{
*i = 2;
*s = 100;
printf(" *i = %d\n", *i); // prints 2
printf("u.i = %d\n", u.i); // prints 100
return 0;
}
(gcc 5.3.0, with -fstrict-aliasing -std=c99 -O2, also with -std=c11)
My theory is that 100 is the 'correct' answer, because the write to the union member through the short-lvalue *s is defined as such (for this platform/endianness/whatever). But I think that the optimizer doesn't realize that the write to *s can alias u.i, and therefore it thinks that *i=2; is the only line that can affect *i. Is this a reasonable theory?
If *s can alias u.i, and u.i can alias *i, then surely the compiler should think that *s can alias *i? Shouldn't aliasing be 'transitive'?
Finally, I always had this assumption that strict-aliasing problems were caused by bad casting. But there is no casting in this!
(My background is C++, I'm hoping I'm asking a reasonable question about C here. My (limited) understanding is that, in C99, it is acceptable to write through one union member and then reading through another member of a different type.)

The disrepancy is issued by -fstrict-aliasing optimization option. Its behavior and possible traps are described in GCC documentation:
Pay special attention to code like this:
union a_union {
int i;
double d;
};
int f() {
union a_union t;
t.d = 3.0;
return t.i;
}
The practice of reading from a different union member than the one
most recently written to (called “type-punning”) is common. Even with
-fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
See Structures unions enumerations and bit-fields implementation. However, this code might
not:
int f() {
union a_union t;
int* ip;
t.d = 3.0;
ip = &t.i;
return *ip;
}
Note that conforming implementation is perfectly allowed to take advantage of this optimization, as second code example exhibits undefined behaviour. See Olaf's and others' answers for reference.

C standard (i.e. C11, n1570), 6.5p7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
...
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
The lvalue expressions of your pointers are not union types, thus this exception does not apply. The compiler is correct exploiting this undefined behaviour.
Make the pointers' types pointers to the union type and dereference with the respective member. That should work:
union {
...
} u, *i, *p;

Strict aliasing is underspecified in the C Standard, but the usual interpretation is that union aliasing (which supersedes strict aliasing) is only permitted when the union members are directly accessed by name.
For rationale behind this consider:
void f(int *a, short *b) {
The intent of the rule is that the compiler can assume a and b don't alias, and generate efficient code in f. But if the compiler had to allow for the fact that a and b might be overlapping union members, it actually couldn't make those assumptions.
Whether or not the two pointers are function parameters or not is immaterial, the strict aliasing rule doesn't differentiate based on that.

This code indeed invokes UB, because you do not respect the strict aliasing rule. n1256 draft of C99 states in 6.5 Expressions §7:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
Between the *i = 2; and the printf(" *i = %d\n", *i); only a short object is modified. With the help of the strict aliasing rule, the compiler is free to assume that the int object pointed by i has not been changed, and it can directly use a cached value without reloading it from main memory.
It is blatantly not what a normal human being would expect, but the strict aliasing rule was precisely written to allow optimizing compilers to use cached values.
For the second print, unions are referenced in same standard in 6.2.6.1 Representations of types / General §7:
When a value is stored in a member of an object of union type, the bytes of the object
representation that do not correspond to that member but do correspond to other members
take unspecified values.
So as u.s has been stored, u.i have taken a value unspecified by standard
But we can read later in 6.5.2.3 Structure and union members §3 note 82:
If the member used to access the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.
Although notes are not normative, they do allow better understanding of the standard. When u.s have been stored through the *s pointer, the bytes corresponding to a short have been changed to the 2 value. Assuming a little endian system, as 100 is smaller that the value of a short, the representation as an int should now be 2 as high order bytes were 0.
TL/DR: even if not normative, the note 82 should require that on a little endian system of the x86 or x64 families, printf("u.i = %d\n", u.i); prints 2. But per the strict aliasing rule, the compiler is still allowed to assumed that the value pointed by i has not changed and may print 100

You are probing a somewhat controversial area of the C standard.
This is the strict aliasing rule:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to
the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a character type.
(C2011, 6.5/7)
The lvalue expression *i has type int. The lvalue expression *s has type short. These types are not compatible with each other, nor both compatible with any other particular type, nor does the strict aliasing rule afford any other alternative that allows both accesses to conform if the pointers are aliased.
If at least one of the accesses is non-conforming then the behavior is undefined, so the result you report -- or indeed any other result at all -- is entirely acceptable. In practice, the compiler must produce code that reorders the assignments with the printf() calls, or that uses a previously loaded value of *i from a register instead of re-reading it from memory, or some similar thing.
The aforementioned controversy arises because people will sometimes point to footnote 95:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
Footnotes are informational, however, not normative, so there's really no question which text wins if they conflict. Personally, I take the footnote simply as an implementation guidance, clarifying the meaning of the fact that the storage for union members overlaps.

Looks like this is a result of the optimizer doing its magic.
With -O0, both lines print 100 as expected (assuming little-endian). With -O2, there is some reordering going on.
gdb gives the following output:
(gdb) start
Temporary breakpoint 1 at 0x4004a0: file /tmp/x1.c, line 14.
Starting program: /tmp/x1
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
Temporary breakpoint 1, main () at /tmp/x1.c:14
14 {
(gdb) step
15 *i = 2;
(gdb)
18 printf(" *i = %d\n", *i); // prints 2
(gdb)
15 *i = 2;
(gdb)
16 *s = 100;
(gdb)
18 printf(" *i = %d\n", *i); // prints 2
(gdb)
*i = 2
19 printf("u.i = %d\n", u.i); // prints 100
(gdb)
u.i = 100
22 }
(gdb)
0x0000003fa441d9f4 in __libc_start_main () from /lib64/libc.so.6
(gdb)
The reason this happens, as others have stated, is because it is undefined behavior to access a variable of one type through a pointer to another type even if the variable in question is part of a union. So the optimizer is free to do as it wishes in this case.
The variable of the other type can only be read directly via a union which guarantees well defined behavior.
What's curious is that even with -Wstrict-aliasing=2, gcc (as of 4.8.4) doesn't complain about this code.

Whether by accident or by design, C89 includes language which has been interpreted in two different ways (along with various interpretations in-between). At issue is the question of when a compiler should be required to recognize that storage used for one type might be accessed via pointers of another. In the example given in the C89 rationale, aliasing is considered between a global variable which is clearly not part of any union and a pointer to a different type, and nothing in the code would suggest that aliasing could occur.
One interpretation horribly cripples the language, while the other would restrict the use of certain optimizations to "non-conforming" modes. If those who didn't to have their preferred optimizations given second-class status had written C89 to unambiguously match their interpretation, those parts of the Standard would have been widely denounced and there would have been some sort of clear recognition of a non-broken dialect of C which would honor the non-crippling interpretation of the given rules.
Unfortunately, what has happened instead is since the rules clearly don't require compiler writers apply a crippling interpretation, most compiler writers have for years simply interpreted the rules in a fashion which retains the semantics that made C useful for systems programming; programmers didn't have any reason to complain that the Standard didn't mandate that compilers behave sensibly because from their perspective it seemed obvious to everyone that they should do so despite the sloppiness of the Standard. Meanwhile, however, some people insist that since the Standard has always allowed compilers to process a semantically-weakened subset of Ritchie's systems-programming language, there's no reason why a standard-conforming compiler should be expected to process anything else.
The sensible resolution for this issue would be to recognize that C is used for sufficiently varied purposes that there should be multiple compilation modes--one required mode would treat all accesses of everything whose address was taken as though they read and write the underlying storage directly, and would be compatible with code which expects any level of pointer-based type punning support. Another mode could be more restrictive than C11 except when code explicitly uses directives to indicate when and where storage that has been used as one type would need to be reinterpreted or recycled for use as another. Other modes would allow some optimizations but support some code that would break under stricter dialects; compilers without specific support for a particular dialect could substitute one with more defined aliasing behaviors.

Related

Is it legal to access struct members via offset pointers from other struct members?

In these two examples, does accessing members of the struct by offsetting pointers from other members result in Undefined / Unspecified / Implementation Defined Behavior?
struct {
int a;
int b;
} foo1 = {0, 0};
(&foo1.a)[1] = 1;
printf("%d", foo1.b);
struct {
int arr[1];
int b;
} foo2 = {{0}, 0};
foo2.arr[1] = 1;
printf("%d", foo2.b);
Paragraph 14 of C11 § 6.7.2.1 seems to indicate that this should be implementation-defined:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
and later goes on to say:
There may be unnamed padding within a structure object, but not at its beginning.
However, code like the following appears to be fairly common:
union {
int arr[2];
struct {
int a;
int b;
};
} foo3 = {{0, 0}};
foo3.arr[1] = 1;
printf("%d", foo3.b);
(&foo3.a)[1] = 2; // appears to be illegal despite foo3.arr == &foo3.a
printf("%d", foo3.b);
The standard appears to guarantee that foo3.arr is the same as &foo3.a, and it doesn't make sense that referring to it one way is legal and the other not, but equally it doesn't make sense that adding the outer union with the array should suddenly make (&foo3.a)[1] legal.
My reasoning for thinking the first examples must also therefore be legal:
foo3.arr is guaranteed to be the same as &foo.a
foo3.arr + 1 and &foo3.b point to the same memory location
&foo3.a + 1 and &foo3.b must therefore point to the same memory location (from 1 and 2)
struct layouts are required to be consistent, so &foo1.a and &foo1.b should be laid out exactly the same as &foo3.a and &foo3.b
&foo1.a + 1 and &foo1.b must therefore point to the same memory location (from 3 and 4)
I've come across some outside sources that suggest that both the foo3.arr[1] and (&foo3.a)[1] examples are illegal, however I haven't been able to find a concrete statement in the standard that would make it so.
Even if they were both illegal though, it's also possible to construct the same scenario with flexible array pointers which, as far as I can tell, does have standard-defined behavior.
union {
struct {
int x;
int arr[];
};
struct {
int y;
int a;
int b;
};
} foo4;
The original application is considering whether or not a buffer overflow from one struct field into another is strictly speaking defined by the standard:
struct {
char buffer[8];
char overflow[8];
} buf;
strcpy(buf.buffer, "Hello world!");
println(buf.overflow);
I would expect this to output "rld!" on nearly any real-world compiler, but is this behavior guaranteed by the standard, or is it an undefined or implementation-defined behavior?
Introduction: The standard is inadequate in this area, and there is decades of history of argument on this topic and strict aliasing with no convincing resolution or proposal to fix.
This answer reflects my view rather than any imposition of the Standard.
Firstly: it's generally agreed that the code in your first code sample is undefined behaviour due to accessing outside the bounds of an array via direct pointer arithmetic.
The rule is C11 6.5.6/8 . It says that indexing from a pointer must remain within "the array object" (or one past the end). It doesn't say which array object but it is generally agreed that in the case int *p = &foo.a; then "the array object" is foo.a, and not any larger object of which foo.a is a subobject.
Relevant links:
one, two.
Secondly: it's generally agreed that both of your union examples are correct. The standard explicitly says that any member of a union may be read; and whatever the contents of the relevant memory location are are interpreted as the type of the union member being read.
You suggest that the union being correct implies that the first code should be correct too, but it does not. The issue is not with specifying the memory location read; the issue is with how we arrived at the expression specifying that memory location.
Even though we know that &foo.a + 1 and &foo.b are the same memory address, it's valid to access an int through the second and not valid to access an int through the first.
It's generally agreed that you can access the int by computing its address in other ways that don't break the 6.5.6/8 rule, e.g.:
((int *)((char *)&foo + offsetof(foo, b))[0]
or
((int *)((uintptr_t)&foo.a + sizeof(int)))[0]
Relevant links: one, two
It's not generally agreed on whether ((int *)&foo)[1] is valid. Some say it's basically the same as your first code, since the standard says "a pointer to an object, suitably converted, points to the element's first object". Others say it's basically the same as my (char *) example above because it follows from the specification of pointer casting. A few even claim it's a strict aliasing violation because it aliases a struct as an array.
Maybe relevant is N2090 - Pointer provenance proposal. This does not directly address the issue, and doesn't propose a repeal of 6.5.6/8.
According to C11 draft N1570 6.5p7, an attempt to access the stored value of a struct or union object using anything other than an lvalue of character type, the struct or union type, or a containing struct or union type, invokes UB even if behavior would otherwise be fully described by other parts of the Standard. This section contains no provision that would allow an lvalue of a non-character member type (or any non-character numeric type, for that matter) to be used to access the stored value of a struct or union.
According to the published Rationale document, however, the authors of the Standard recognized that different implementations offered different behavioral guarantees in cases where the Standard imposed no requirements, and regarded such "popular extensions" as a good and useful thing. They judged that questions of when and how such extensions should be supported would be better answered by the marketplace than by the Committee. While it may seem weird that the Standard would allow an obtuse compiler to ignore the possibility that someStruct.array[i] might affect the stored value of someStruct, the authors of the Standard recognized that any compiler whose authors aren't deliberately obtuse will support such a construct whether the Standard mandates or not, and that any attempt to mandate any kind of useful behavior from obtusely-designed compilers would be futile.
Thus, a compiler's level of support for essentially anything having to do with structures or unions is a quality-of-implementation issue. Compiler writers who are focused on being compatible with a wide range of programs will support a wide range of constructs. Those which are focused on maximizing the performance of code that needs only those constructs without which the language would be totally useless, will support a much narrower set. The Standard, however, is devoid of guidance on such issues.
PS--Compilers that are configured to be compatible with MSVC-style volatile semantics will interpret that qualifier as a indicating that an access to the pointer may have side-effects that interact with objects whose address has been taken and that aren't guarded by restrict, whether or not there is any other reason to expect such a possibility. Use of such a qualifier when accessing storage in "unusual" ways may make it more obvious to human readers that the code is doing something "weird" at the same time as it will thus ensure compatibility with any compiler that uses such semantics, even if such compiler would not otherwise recognize that access pattern. Unfortunately, some compiler writers refuse to support such semantics at anything other than optimization level 0 except with programs that demand it using non-standard syntax.

Which is the correct way to transfer values between unions?

If you have:
typedef union value {
int i;
float f;
} VALUE;
VALUE a, b;
if you know the type of a, should you do
b.i = a.i;
b.f = a.f;
or
if(a_type == INT)
b.i = a.i;
if(a_type == FLOAT)
b.f = a.f;
Just use b = a unless there is a particular reason not to, such as that the union occasionally contains a large amount of data, and you want to optimize the assignment for cases when it contains only a small amount.
Per C 2011 [N1570] 6.5.16.1 1, one of the acceptable situations for simple assignment is:
the left operand has an atomic, qualified, or unqualified version of a structure or union type compatible with the type of the right.
Per 6.2.7 1:
Two types have compatible type if their types are the same.
(Per 6.2.6.1 6, “The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.”)
The most correct probably is (assuming the the same type)
memcpy(&b, &a, sizeof(b));
The Standard does not specify in what cases the storage associated with an aggregate may be accessed via a member-access lvalue of non-character type. If a union member member1 is the largest type in a union and has no trap representations, a quality compiler should be able to handle operations like
objectOfMemberType = someUnion.member1;
someUnion.member1 = valueOfMemberType;
A quality compiler should also have no problem with constructs like:
memberType *p = &someUnion.member1;
*p = valueOfMemberType;
As the Standard is written, it imposes no requirements upon the behavior of either construct if the member is of non-character type, and thus the only ways to do anything with unions without invoking UB are to either use lvalue of union type to copy the entire contents of a union, use lvalues of character type to accesses pieces of the union, or use functions like memcpy. Accessing union contents directly through member lvalues seems mostly reliable in gcc and clang (I think the cases I've found where it doesn't are unintentional bugs) but neither gcc nor clang seems to be designed to reliably support any use of union member pointers, even when the usage immediately follows the act of taking the address [as shown above].
Given the way the C Standard is written, it's hard to really determine a "correct" way of transferring values between unions, since most kinds of code which would employ unions in useful fashion invoke UB. Which approach is best depends upon which forms of UB one is willing to trust one's compiler to process sensibly.

Is this interpretation of union behavior accurate?

^^^ THIS QUESTION IS NOT ABOUT TYPE PUNNING ^^^
It is my understanding that an object contained in a union can only be used if it is active, and that it is active iff it was the last member to have a value stored to it. This suggests that the following code should be undefined at the points I mark.
My question is if I am correct in my understanding of when it is defined to access a member of a union, particularly in the following situations.
#include <stddef.h>
#include <stdio.h>
void work_with_ints(int* p, size_t k)
{
size_t i = 1;
for(;i<k;++i) p[i]=p[i-1];
}
void work_with_floats(float* p, size_t k)
{
size_t i = 1;
for(;i<k;++i) p[i]=p[i-1];
}
int main(void)
{
union{ int I[4]; float F[4]; } u;
// this is undefined because no member of the union was previously
// selected by storing a value to the union object
work_with_ints(u.I,4);
printf("%d %d %d %d\n",u.I[0],u.I[1],u.I[2],u.I[3]);
u.I[0]=1; u.I[1]=2; u.I[2]=3; u.I[3]=4;
// this is undefined because u currently stores an object of type int[4]
work_with_floats(u.F,4);
printf("%f %f %f %f\n",u.F[0],u.F[1],u.F[2],u.F[3]);
// this is defined because the assignment makes u store an object of
// type F[4], which is subsequently accessed
u.F[0]=42.0;
work_with_floats(u.F,4);
printf("%f %f %f %f\n",u.F[0],u.F[1],u.F[2],u.F[3]);
return 0;
}
Am I correct in the three items I have noted?
My actual example is not possible to use here due to size, but it was suggested in a comment that I extend this example to something compileable. I compiled and ran the above in both clang (-Weverything -std=c11) and gcc (-pedantic -std=c11). Each gave the following:
0 0 0 0
0.000000 0.000000 0.000000 0.000000
42.000000 42.000000 42.000000 42.000000
That seems appropriate, but that does not mean the code is compliant.
EDIT:
To clarify what the code is doing, I will point out the exact instances where the property I mention in the first paragraph is applied.
First, the contents of an uninitialized union are read and modified. This is undefined behavior, rather than unspecified with a potential for UB with traps, if the principle I mention in the first paragraph is true.
Second, the contents of a union are used with the type of an inactive union member. Again, this is undefined behavior, rather than unspecified with a potential for UB with traps, if the principle I mention in the first paragraph is true.
Third, the item just mentioned as "second" produces unspecified behavior with a potential for UB with traps, if first one element of the array contained in the inactive member is modified. This makes the whole array the active member, hence the change in definedness.
I am demonstrating the consequences of the principle in the first paragraph of this question, to show how that principle, if correct, affects the nature of the C standard. Consequent the significant effect on the nature of the standard in some circumstances, I am looking for help in determining if the principle I have stated is a correct understanding of the standard.
EDIT:
I think it may help to describe how I get from the standard the principle in the first paragraph above, and how one might disagree. Not much is said on the matter in the standard, so there has to be some filling in of the gaps no matter what.
The standard describes a union as holding one object at a time. This seems to suggest treating it like a structure containing one element. It seems that anything deviating from that interpretation deserves mention. That is how I get to the principle I have stated.
On the other hand, the discussion of effective type does not define the term "declared type". If that term is understood such that union members do not have a declared type, then it could be argued that each subobject of a union need be interpreted as another member recursively. So, in the last example in my code, all floating point array members would need to be initialized, not just the first.
The two examples I give of undefined behavior are important to me to resolve. However, the last example, which relates to the above paragraph, seems most crucial. I could really see an argument either way there.
EDIT:
This is not a type punning question. First, I am talking about writing to unions, not reading from them. Second, I am talking about the validity of doing these writes with a pointer rather than with the union type. This is very different from type punning issues.
This question is more related to strict aliasing than it is to type punning. You can not access memory however you want due to strict aliasing. This question deals with exactly how unions ease the constraints of strict aliasing on their members. It is not said they ever do that, but if they don't then you could never do something like the following.
union{int i} u; u.i=0; function_working_with_an_int_pointer (&u.i);
So, clearly, unions affect the application of strict aliasing rules in some cases. My question is to confirm that the line I have drawn according to my reading of the standard is correct.
an object contained in a union can only be used if it is active, and that it is active iff it was the last member to have a value stored to it.
The statement is false. The behavior is reliable and defined.
union {
unsigned char c [4];
long d;
} v;
v .d = 0xaabbccddL;
printf ("%x\n", v .c [2]);
It is completely acceptable to access the c member even though it was not the last assigned. On a little endian machine, it will definitely show bb and on a big endian machine, cc.

Can a "container_of" macro ever be strictly-conforming?

A commonly-used macro in the linux kernel (and other places) is container_of, which is (basically) defined as follows:
#define container_of(ptr, type, member) (((type) *)((char *)(ptr) - offsetof((type), (member))))
Which basically allows recovery of a "parent" structure given a pointer to one of its members:
struct foo {
char ch;
int bar;
};
...
struct foo f = ...
int *ptr = &f.bar; // 'ptr' points to the 'bar' member of 'struct foo' inside 'f'
struct foo *g = container_of(ptr, struct foo, bar);
// now, 'g' should point to 'f', i.e. 'g == &f'
However, it's not entirely clear whether the subtraction contained within container_of is considered undefined behavior.
On one hand, because bar inside struct foo is only a single integer, then only *ptr should be valid (as well as ptr + 1). Thus, the container_of effectively produces an expression like ptr - sizeof(int), which is undefined behavior (even without dereferencing).
On the other hand, §6.3.2.3 p.7 of the C standard states that converting a pointer to a different type and back again shall produce the same pointer. Therefore, "moving" a pointer to the middle of a struct foo object, then back to the beginning should produce the original pointer.
The main concern is the fact that implementations are allowed to check for out-of-bounds indexing at runtime. My interpretation of this and the aforementioned pointer equivalence requirement is that the bounds must be preserved across pointer casts (this includes pointer decay - otherwise, how could you use a pointer to iterate across an array?). Ergo, while ptr may only be an int pointer, and neither ptr - 1 nor *(ptr + 1) are valid, ptr should still have some notion of being in the middle of a structure, so that (char *)ptr - offsetof(struct foo, bar) is valid (even if the pointer is equal to ptr - 1 in practice).
Finally, I came across the fact that if you have something like:
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
while it's undefined behavior to dereference p, the pointer by itself is valid, and required to compare equal to q (see this question). This means that p and q compare the same, but can be different in some implementation-defined manner (such that only q can be dereferenced). This could mean that given the following:
// assume same 'struct foo' and 'f' declarations
char *p = (char *)&f.bar;
char *q = (char *)&f + offsetof(struct foo, bar);
p and q compare the same, but could have different boundaries associated with them, as the casts to (char *) come from pointers to incompatible types.
To sum it all up, the C standard isn't entirely clear about this type of behavior, and attempting to apply other parts of the standard (or, at least my interpretations of them) leads to conflicts. So, is it possible to define container_of in a strictly-conforming manner? If so, is the above definition correct?
This was discussed here after comments on my answer to this question.
TLDR
It is a matter of debate among language lawyers as to whether programs using container_of are strictly conforming, but pragmatists using the container_of idiom are in good company and are unlikely to run into issues running programs compiled with mainstream tool chains on mainstream hardware. In other words:
strictly conforming: debated
conforming: yes, for all practical purposes, in most situations
What can be said today
There is no language in the standard C17 standard that unambiguously requires support for the container_of idiom.
There are defect reports that suggest the standard intends to allow implementations room to forbid the container_of idiom by tracking "provenance" (i.e. the valid bounds) of objects along with pointers. However, these alone are not normative.
There is recent activity in the C memory object model study group that aims to provide more rigor to this and similar questions. See Clarifying the C memory object model - N2012 from 2016, Pointers are more abstract than you might expect from 2018, and A Provenance-aware Memory Object Model for C - N2676 from 2021.
Depending on when you read this, there may be newer documents available at the WG14 document log. Additionally, Peter Sewell collects related reference material here: https://www.cl.cam.ac.uk/~pes20/cerberus/. These documents do not change what a strictly conforming program is today (in 2021, for versions C17 and older), but they suggest that the answer may change in newer versions of the standard.
Background
What is the container_of idiom?
This code demonstrates the idiom by expanding the contents of the macro usually seen implementing the idiom:
#include <stddef.h>
struct foo {
long first;
short second;
};
void container_of_idiom(void) {
struct foo f;
char* b = (char*)&f.second; /* Line A */
b -= offsetof(struct foo, second); /* Line B */
struct foo* c = (struct foo*)b; /* Line C */
}
In the above case, a container_of macro would typically take a short* argument intended to point to the second field of a struct foo. It would also take arguments for struct foo and second, and would expand to an expression returning struct foo*. It would employ the logic seen in lines A-C above.
The question is: is this code strictly conforming?
First, let's define "strictly conforming"
C17 4 (5-7) Conformance
A strictly conforming program shall use only those features of the language and library specified in this International Standard. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.
[...] A conforming hosted implementation shall accept any strictly conforming program. [...] A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.
A conforming program is one that is acceptable to a conforming implementation.
(For brevity I omitted the definition of "freestanding" implementations, as it concerns limitations on the standard library not relevant here.)
From this we see that strict conformance is quite strict, but a conforming implementation is allowed to define additional behavior as long as it does not alter the behavior of a strictly conforming program. In practice, almost all implementations do this; this is the "practical" definition that most C programs are written against.
For the purposes of this answer I'll contain my answer to strictly conforming programs, and talk about merely conforming programs at the end.
Defect reports
The language standard itself is somewhat unclear on the question, but several defect reports shed more light on the issue.
DR 51
DR 51 ask questions of this program:
#include <stdlib.h>
struct A {
char x[1];
};
int main() {
struct A *p = (struct A *)malloc(sizeof(struct A) + 100);
p->x[5] = '?'; /* This is the key line */
return p->x[5];
}
The response to the DR includes (emphasis mine):
Subclause 6.3.2.1 describes limitations on pointer arithmetic, in connection with array subscripting. (See also subclause 6.3.6.) Basically, it permits an implementation to tailor how it represents pointers to the size of the objects they point at. Thus, the expression p->x[5] may fail to designate the expected byte, even though the malloc call ensures that the byte is present. The idiom, while common, is not strictly conforming.
Here we have the first indication that the standard allows implementations to "tailor" pointer representations based on the objects pointed at, and that pointer arithmetic that "leaves" the valid range of the original object pointed to is not strictly conforming.
DR 72 ask questions of this program:
#include <stddef.h>
#include <stdlib.h>
typedef double T;
struct hacked {
int size;
T data[1];
};
struct hacked *f(void)
{
T *pt;
struct hacked *a;
char *pc;
a = malloc(sizeof(struct hacked) + 20 * sizeof(T));
if (a == NULL) return NULL;
a->size = 20;
/* Method 1 */
a->data[8] = 42; /* Line A /*
/* Method 2 */
pt = a->data;
pt += 8; /* Line B /*
*pt = 42;
/* Method 3 */
pc = (char *)a;
pc += offsetof(struct hacked, data);
pt = (T *)pc; /* Line C */
pt += 8; /* Line D */
*pt = 6 * 9;
return a;
}
Astute readers will notice that /* Method 3 */ above is much like the container_of idiom. I.e. it takes a pointer to a struct type, converts it to char*, does some pointer arithmetic that takes the char* outside the range of the original struct, and uses the pointer.
The committee responded by saying /* Line C */ was strictly conforming but /* Line D */ was not strictly conforming by the same argument given for DR 51 above. Further, the committee said that the answers "are not affected if T has char type."
Verdict: container_of is not strictly conforming (probably)
The container_of idiom takes a pointer to a struct's subobject, converts the pointer to char*, and performs pointer arithmetic that moves the pointer outside the subobject. This is the same set of operations discussed in DR 51 and 72 apply. There is clear intent on the part of the committee. They hold that the standard "permits an implementation to tailor how it represents pointers to the size of the objects they point at" and thus "the idiom, while common, is not strictly conforming."
One might argue that container_of side steps the issue by doing the pointer arithmetic in the domain of char* pointers, but the committee says the answer is "not affected if T has char type."
May the container_of idiom be used in practice?
No, if you want to be strict and use only code that is not clearly strictly conforming according to current language standards.
Yes, if you are a pragmatist and believe that an idiom widely used in Linux, FreeBSD, Microsoft Windows C code is enough to label the idiom conforming in practice.
As noted above, implementations are allowed to guarantee behavior in ways not required by the standard. On a practical note, the container_of idiom is used in the Linux kernel and many other projects. It is easy for implementations to support on modern hardware. Various "sanitizer" systems such as Address Sanitizer, Undefined Behavior Sanitizer, Purify, Valgrind, etc., all allow this behavior. On systems with flat address spaces, and even segmented ones, various "pointer games" are common (e.g. converting to integral values and masking off low order bits to find page boundaries, etc). These techniques are so common in C code today that it is very unlikely that such idioms will cease to function on any commonly supported system now or in the future.
In fact, I found one implementation of a bounds checker that gives a different interpretation of C semantics in its paper. The quotes are from the following paper: Richard W. M. Jones and Paul H. J. Kelly. Backwards-compatible bounds checking for arrays and pointers in C programs. In Third International Workshop on Automated Debugging (editors M. Kamkarand D. Byers), volume 2 (1997), No. 009 of Linköping Electronic Articles in Computer and Information Science. Linköping University Electronic Press, Linköping, Sweden. ISSN 1401-9841, May 1997 pp. 13–26. URL http://www.ep.liu.se/ea/cis/1997/009/02/
ANSI C conveniently allows us to define an object as the fundamental unit of memory allocation. [...] Operations are permitted which manipulate pointers within objects, but pointer operations are not permitted to cross between two objects. There is no ordering defined between objects, and the programmer should never be allowed to make assumptions about how objects are arranged in memory.
Bounds checking is not blocked or weakened by the use of a cast (i.e. type coercion). Cast can properly be used to change the type of the object to which a pointer refers, but cannot be used to turn a pointer to one object into a pointer to another. A corollary is that bounds checking is not type checking: it does not prevent storage from being declared with one data structure and used with another. More subtly, note that for this reason, bounds checking in C cannot easily validate use of arrays of structs which contain arrays in turn.
Every valid pointer-valued expression in C derives its result from exactly one original storage object. If the result of the pointer calculation refers to a different object, it is invalid.
This language is quite definitive but take note the paper was published in 1997, before the DR reports above were written and responded to. The best way to interpret the bounds checking system described in the paper is as a conforming implementation of C, but not one that detects all non-strictly conforming programs. I do see similarities between this paper and A Provenance-aware Memory Object Model for C - N2676 from 2021, however, so in the future the ideas similar to the ones quoted above might be codified in the language standard.
The C memory object model study group is a treasure trove of discussions related to container_of and many other closely related problems. From their mailing list archive we have these mentions of the container_of idiom:
2.5.4 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?
The standard is ambiguous on the interaction between the allowable pointer arithmetic (on unsigned char* representation pointers) and subobjects. For example, consider:
Example cast_struct_inter_member_1.c
#include <stdio.h>
#include <stddef.h>
typedef struct { float f; int i; } st;
int main() {
st s = {.f=1.0, .i=1};
int *pi = &(s.i);
unsigned char *pci = ((unsigned char *)pi);
unsigned char *pcf = (pci - offsetof(st,i))
+ offsetof(st,f);
float *pf = (float *)pcf;
*pf = 2.0; // is this free of undefined behaviour?
printf("s.f=%f *pf=%f s.i=%i\n",s.f,*pf,s.i);
}
This forms an unsigned char* pointer to the second member (i) of a struct, does arithmetic on that using offsetof to form an unsigned char* pointer to the first member, casts that into a pointer to the type of the first member (f), and uses that to write.
In practice we believe that this is all supported by most compilers and it is used in practice, e.g. as in the Container idiom of Chisnall et al. [ASPLOS 2015], where they discuss container macros that take a pointer to a structure member and compute a pointer to the structure as a whole. They see it heavily used by one of the example programs they studied. We are told that Intel's MPX compiler does not support the container macro idiom, while Linux, FreeBSD, and Windows all rely on it.
The standard says (6.3.2.3p7): "...When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.". This licenses the construction of the unsigned char* pointer pci to the start of the representation of s.i (presuming that a structure member is itself an "object", which itself is ambiguous in the standard), but allows it to be used only to access the representation of s.i.
The offsetof definition in stddef.h, 7.19p3, " offsetof(type,member-designator) which expands to an integer constant expression that has type size_t, the value of which is the offset in bytes, to the structure member (designated by member-designator, from the beginning of its structure (designated by type", implies that the calculation of pcf gets the correct numerical address, but does not say that it can be used, e.g. to access the representation of s.f. As we saw in the discussion of provenance, in a post-DR260 world, the mere fact that a pointer has the correct address does not necessarily mean that it can be used to access that memory without giving rise to undefined behaviour.
Finally, if one deems pcf to be a legitimate char* pointer to the representation of s.f, then the standard says that it can be converted to a pointer to any object type if sufficiently aligned, which for float* it will be. 6.3.2.3p7: "A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned (68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer....". But whether that pointer has the right value and is usable to access memory is left unclear.
This example should be allowed in our de facto semantics but is not clearly allowed in the ISO text.
What needs to be changed in the ISO text to clarify this?
More generally, the ISO text's use of "object" is unclear: does it refer to an allocation, or are struct members, union members, and array elements also "objects"?
Key phrase being "This example should be allowed in our de facto semantics but is not clearly allowed in the ISO text." i.e. I take this to mean the the group documenets like N2676 wish to see container_of supported.
However, in a later message:
2.2 Provenance and subobjects: container-of casts
A key question is whether one can cast from a pointer to the first member of a struct to the struct as a whole, and then use that to access other members. We discussed it previously in N2222 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?, N2222 Q37 Are usable pointers to a struct and to its first member interconvertable?, N2013, and N2012. Some of us had thought that that was uncontroversially allowed in ISO C, by 6.7.2.1p15 ...A pointer to a structure object, suitably converted, points to its initial member..., and vice versa..., but others disagree. In practice, this seems to be common in real code, in the "container-of" idiom.
Though someone suggested that the IBM XL C/C++ compiler does not support it. Clarification from WG14 and compiler teams would be very helpful on this point.
With this, the group sums it up nicely: the idiom is widely used, but there is disagreement about what the standard says about it.
I think its strictly conforming or there's a big defect in the standard. Referring to your last example, the section on pointer arithmetic doesn't give the compiler any leeway to treat p and q differently. It isn't conditional on how the pointer value was obtained, only what object it points to.
Any interpretation that p and q could be treated differently in pointer arithmetic would require an interpretation that p and q do not point to the same object. Since since there's no implementation dependent behaviour in how you obtained p and q then that would mean they don't point to the same object on any implementation. That would in turn require that p == q be false on all implementations, and so would make all actual implementations non-conforming.
I just want to answer this bit.
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
This is not UB. It is certain that p is a pointer to an element of the array, provided only that it is within bounds. In each case it points to the 6th element of a 25 element array, and can safely be dereferenced. It can also be incremented or decremented to access other elements of the array.
See n3797 S8.3.4 for C++. The wording is different for C, but the meaning is the same. In effect arrays have a standard layout and are well-behaved with respect to pointers.
Let us suppose for a moment that this is not so. What are the implications? We know that the layout of an array int[5][5] is identical to int[25], there can be no padding, alignment or other extraneous information. We also know that once p and q have been formed and given a value, they must be identical in every respect.
The only possibility is that, if the standard says it is UB and the compiler writer implements the standard, then a sufficiently vigilant compiler might either (a) issue a diagnostic based on analysing the data values or (b) apply an optimisation which was dependent on not straying outside the bounds of sub-arrays.
Somewhat reluctantly I have to admit that (b) is at least a possibility. I am led to the rather strange observation that if you can conceal from the compiler your true intentions this code is guaranteed to produce defined behaviour, but if you do it out in the open it may not.

What does C standard say about a union of two identical types

Is it possible to to have the following assert fail with any compiler on any architecture?
union { int x; int y; } u;
u.x = 19;
assert(u.x == u.y);
C99 Makes a special guarantee for a case when two members of a union are structures that share an initial sequence of fields:
struct X {int a; /* other fields may follow */ };
struct Y {int a; /* other fields may follow */ };
union {X x; Y y;} u;
u.x.a = 19;
assert(u.x.a == u.y.a); // Guaranteed never to fail by 6.5.2.3-5.
6.5.2.3-5 : One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the complete type of the union is
visible. Two structures share a common initial sequence if corresponding members have
compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members.
However, I was unable to find a comparable guarantee for non-structured types inside a union. This may very well be an omission, though: if the standard goes some length to describe what must happen with structured types that are not even the same, it should have clarified the same point for simpler, non-structured types.
The assert in the problem will never fail in an implementation of standard C because accessing u.y after an assignment to u.x is required to reinterpret the bytes of u.x as the type of u.y. Since the types are the same, the reinterpretation produces the same value.
This requirement is noted in C 2011 (N1570) 6.5.2.3 note 95, which indicates it derives from clause 6.2.6, which covers the representations of types. Note 95 says:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
(N1570 is an unofficial draft but is readily available on the net.)
I believe this question is very hard to answer in the manner you seem to expect.
As far as I know, reading one field of a union that is not the one that was most recently wwritten to, is undefined behavior.
Thus, it's impossible to answer with "no", since any compiler writer is free to specifically detect this and make it fail just out of spite, if they feel like it.

Resources