Related
Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.
Sample Program
#include <stdio.h>
int main() {
int arr[0];
return 0;
}
Clarification
I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.
This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];
I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.
Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others were pointing out its actual use for tail-padded structures, which, while relevant, isn't exactly what I was looking for.
An array cannot have zero size.
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
The above text applies to a plain array (paragraph 1). For a VLA (variable-length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard; a compiler is not allowed to implement it differently.
gcc -std=c99 -pedantic gives a warning for the non-VLA case.
As per the standard, it is not allowed.
However, it has been common practice for C compilers to treat such declarations as flexible array member (FAM) declarations:
C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
The standard syntax of a FAM is:
struct Array {
size_t size;
int content[];
};
The idea is that you would then allocate it so:
void foo(size_t x) {
struct Array* array = malloc(sizeof(struct Array) + x * sizeof(int));
array->size = x;
for (size_t i = 0; i != x; ++i) {
array->content[i] = 0;
}
}
You might also use it statically (gcc extension):
struct Array a = { 3, { 1, 2, 3 } };
This is also known as a tail-padded structure (the term predates the publication of the C99 Standard) or the struct hack (thanks to Joe Wreschnig for pointing it out).
However, this syntax was standardized (and its effects guaranteed) only as recently as C99. Before that, a constant size was necessary.
1 was the portable way to go, though it was rather strange.
0 was better at indicating intent, but not legal as far as the Standard was concerned; it was supported as an extension by some compilers (including gcc).
The tail-padding practice, however, relies on storage actually being available (via a careful malloc), so it is not suited to stack usage in general.
In standard C and C++, zero-size arrays are not allowed.
If you're using GCC, compile with the -pedantic option. It will give a warning:
zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]
For C++, it gives a similar warning.
It's totally illegal, and always has been, but a lot of compilers
neglect to signal the error. I'm not sure why you want to do this.
The one use I know of is to trigger a compile time error from a boolean:
char someCondition[ condition ];
If condition is false, then I get a compile-time error. Because
compilers do allow zero-size arrays, however, I've taken to using:
char someCondition[ 2 * condition - 1 ];
This gives a size of either 1 or -1, and I've never found a compiler
which would accept a size of -1.
Another use of zero-length arrays is for making variable-length objects (pre-C99). Zero-length arrays are different from flexible array members, which are written [] without the 0.
Quoted from gcc doc:
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied.
A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).
I'll add that there is a whole page of the online gcc documentation on this topic.
Some quotes:
Zero-length arrays are allowed in GNU C.
In ISO C90, you would have to give contents a length of 1
and
GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data
so you could
int arr[0] = { 1 };
and boom :-)
Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.
The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like
typedef struct {
uint32_t size;
uint8_t data[4]; // Use four, to avoid having padding throw off the size of the struct
} MyStruct;  /* hypothetical typedef name, added for completeness */
a compiler might get clever and assume the array size really is four:
; As written
foo = myStruct->data[i];
; As interpreted (assuming little-endian hardware)
foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;
Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
You definitely can't have zero-sized arrays per the standard, but in practice every popular compiler lets you do it. So I will try to explain why it can be bad:
#include <cstdio>
int main() {
struct A {
A() {
printf("A()\n");
}
~A() {
printf("~A()\n");
}
int empty[0];
};
A vals[3];
}
As a human, I would expect this output:
A()
A()
A()
~A()
~A()
~A()
Clang prints this:
A()
~A()
GCC prints this:
A()
A()
A()
This is totally strange behavior, and a good reason not to use empty arrays in C++ if you can avoid it.
There is also a GNU C extension that lets you create zero-length arrays in C, but as I understand it, there should be at least one prior member in the structure, or you will get strange behavior like the above if you use C++.
Consider the following simple example:
struct __attribute__ ((__packed__)) {
int code[1];
int place_holder[100];
} s;
void test(int n)
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
The for-loop is writing to the field code, which is of size 1. The next field after code is place_holder.
I would expect that for n > 1, the writes to the code array would overflow into the next field, so that 1 would be written to place_holder.
However, when compiling with -O2 (on gcc 4.9.4, but probably other versions as well), something interesting happens.
The compiler identifies that the loop might overflow the array code, and limits loop unrolling to 1 iteration.
It's easy to see that when compiling with -fdump-tree-all and looking at the last tree pass ("t.optimized"):
;; Function test (test, funcdef_no=0, decl_uid=1366, symbol_order=1)
Removing basic block 5
test (int n)
{
<bb 2>:
# DEBUG i => 0
# DEBUG i => 0
if (n_4(D) > 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
s.code[0] = 1;
# DEBUG i => 1
# DEBUG i => 1
<bb 4>:
return;
}
So in this case the compiler completely unrolled the loop to a single iteration.
My questions are:
From C specification viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
Let's assume I'm aware of the struct layout in memory and know what I'm doing when deliberately overflowing the code array.
Is there a way to prevent gcc from unrolling the loop in such case? I know I can completely prevent loop unrolling, however I'm still interested in loop unrolling on other cases. I also suspect that the analysis the compiler is doing might affect passes other than loop unrolling.
gcc is assuming I'm not going to overflow when accessing my array, so what I'm really looking for is a way to tell the compiler not to make this assumption (by providing some compiler option).
I'm aware it's a bad practice to write such code that overflows from one field to another, and I'm not intending to write such code.
I'm also aware of the practice of putting an array (possibly zero-sized) as the last struct field to allow it to overflow; this is well supported by compilers. In this case, however, the array code is not the last field.
So this is not a question of "how to fix the code", but rather a question of understanding the compiler assumptions and affecting them.
These questions came up when I observed existing code that was already written in such way, and debugged it to find out why it's not behaving as the original developer expected it to behave.
The risk is that there are other places in the code where such problem exists. Static analysis tools can help to find out, but I would also like to know if there's a way to make the compiler tolerate such code and still generate the result we would expect.
Update
I got a clear answer to question (1) above, but not to question (2).
Can gcc allow this as an extension, by some compile options?
Is there a way to at least get a warning when gcc identifies it? (and it clearly identifies it, by optimizing things out).
That's important in order to identify such cases in a large existing code base.
From C specification viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
It is undefined behavior. The arr[i] operator is syntactic sugar around *(arr + i). So array access boils down to the binary + operator for pointer arithmetic, C17 6.5.6 additive operators, from §7 and §8:
For the purposes of these operators, a pointer to an object that is not an element of an
array behaves the same as a pointer to the first element of an array of length one with the
type of the object as its element type.
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. /--/
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.
As you noticed, optimizing compilers might exploit these rules to produce faster code.
Is there a way to prevent gcc from unrolling the loop in such case?
There is a special exception rule that can be used, C17 6.3.2.3/7:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
Also, strict aliasing does not apply to character types, because of another special rule in C17 6.5 §7
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types: ... a character type.
These two special rules co-exist in harmony. So assuming we don't mess up alignment etc during the pointer conversion, this means that we are allowed to do this:
unsigned char* i;
for(i = (unsigned char*)&mystruct; i < (unsigned char*)(&mystruct + 1); i++)
{
do_something(*i);
}
This may however read padding bytes etc so it's "implementation-defined". But in theory you can access the struct byte per byte, and as long as the struct offsets are calculated on byte-per-byte basis, you can iterate across multiple members of the struct (or any other object) in this manner.
As far as I can tell, this very questionable-looking code should be well-defined:
#include <stdint.h>
#include <string.h>
#include <stdio.h>
struct __attribute__ ((__packed__)) {
int code[1];
int place_holder[100];
} s;
void test(int val, int n)
{
for (unsigned char* i = (unsigned char*)&s;
i < (unsigned char*)&s + n*sizeof(int);
i += _Alignof(int))
{
if((uintptr_t)i % _Alignof(int) == 0) // not really necessary, just defensive prog.
{
memcpy(i, &val, sizeof(int));
printf("Writing %d to address %p\n", val, (void*)i);
}
}
}
int main (void)
{
test(42, 3);
printf("%d %d %d\n", s.code[0], s.place_holder[0], s.place_holder[1]);
}
This works fine on gcc and clang (x86). How efficient it is, well that's another story. Please don't write code like this, though.
From C specification viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
It's undefined behavior to access an array out-of-bounds. From C11 J.2:
The behavior is undefined in the following circumstances:
[...]
An array subscript is out of range [...]
Is there a way to prevent gcc from unrolling the loop in such case?
Alias code with a volatile pointer. But even using an intermediary pointer seems to work. godbolt link
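A sketch of the volatile-pointer aliasing mentioned above, applied to the question's struct. Note this remains undefined behavior per the standard; volatile merely stops GCC from exploiting the array bounds in practice, which is an observation about current GCC behavior, not a guarantee:

```c
struct __attribute__ ((__packed__)) {
    int code[1];
    int place_holder[100];
} s;

void test(int n)
{
    /* Accesses through a volatile-qualified pointer cannot be removed
       or folded by the optimizer, which also suppresses the
       out-of-bounds-based loop truncation. */
    volatile int *p = s.code;
    for (int i = 0; i < n; i++) {
        p[i] = 1;
    }
}
```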
Just _Static_assert the layout and do the pointer arithmetic in (char*), then cast to (int*) and do the access. No further tricks such as memcpy/_Alignof are required, because the ints are unpadded and you are accessing ints where there really are ints.
This alone makes gcc unroll the loop.
Character-pointer-based (char*, signed char*, unsigned char*) arithmetic is required because http://port70.net/~nsz/c/c11/n1570.html#J.2 (non-normatively, as it is just an appendix, but gcc seems to follow it) makes out-of-bounds accesses UB, while http://port70.net/~nsz/c/c99/n1256.html#6.2.6.1p4 and http://port70.net/~nsz/c/c99/n1256.html#6.5p6 still allow inspecting any object via character pointers (more discussion on this at Is accessing an element of a multidimensional array out of bounds undefined behavior?).
Alternatively, you could do the pointer arithmetic via uintptr_t (then it will be implementation-defined), but gcc optimizes those worse in certain cases: gcc doesn't fold (uintptr_t)p < (uintptr_t)(p+10) into true, but it does so for (char*)p < (char*)(p+10), which could be considered a missed optimization.
struct __attribute__ ((__packed__)) s {
int code[1];
int place_holder[100];
} s;
void test_s(int n) //original
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
#include <stddef.h> //offsetof
void test_s2(int n) //unrolls the loop
{
_Static_assert(offsetof(struct s,code)+sizeof(int)==offsetof(struct s,place_holder),"");
//^will practically hold even without __attribute__((__packed__))
int i; for (i = 0; i < n; i++)
*(int*)((char*)&s.code + (size_t)i*sizeof(s.code[0])) = 1;
}
/////////////
//same code as test_s2
struct r {
int code101[101];
} r;
void test_r(int n)
{
int i;
for (i = 0; i < n; i++) {
r.code101[i] = 1;
}
}
1. Question:
"From C specification viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?"
It is undefined behavior. The C standard states (emphasis mine):
"A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero)."
Source: ISO/IEC 9899:2018 (C18), §6.5.2.1/2
"When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P) + N (equivalently, N + (P)) and (P) - N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P) + 1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q) - 1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated."
Source: ISO/IEC 9899:2018 (C18), §6.5.6/8
Also non-normative Annex J states with regard to paragraph §6.5.6 in the normative standard:
J.2 Undefined behavior
1 The behavior is undefined in the following circumstances:
....
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).
2. Question (plus update):
"Is there a way to prevent gcc from unrolling the loop in such case?"
"Can gcc allow this as an extension, by some compile options?"
"Is there a way to at least get a warning when gcc identifies it? That's important in order to identify such cases in a large existing code base."
You could try to place an empty assembly statement, asm("");, into the loop, as shown in this answer by Denilson Sá Maia, e.g.:
for (i = 0; i < n; i++) {
s.code[i] = 1;
asm("");
}
or #pragmas around the test function, as shown here, e.g.:
#pragma GCC push_options
#pragma GCC optimize ("O0")
void test(int n)
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
#pragma GCC pop_options
to prevent the optimization for that specific program part in general and with that the loop unrolling.
Related:
How to prevent gcc optimizing some statements in C?
How to prevent GCC from optimizing out a busy wait loop?
Is there a way to tell GCC not to optimise a particular piece of code?
It does not prevent the loop unrolling, but you can use AddressSanitizer (which has LeakSanitizer integrated and has been built into GCC since version 4.8) to detect when the loop unrolling doesn't work, i.e. when you access memory that doesn't belong to the array.
More information about this, you can find here.
Edit: As you said your target implementation is MIPS, you can still use Valgrind to detect memory errors.
In the language Dennis Ritchie described in 1974, the behavior of struct member access operators and pointer arithmetic were defined in terms of machine addresses, and except for the use of object size to scale pointer arithmetic, were agnostic as to the types of objects the addresses represented. The C Standard allows implementations to behave in that fashion when their customers would find it useful, but would also allow them to do other things, such as trapping out-of-bounds array accesses, if customers would find those other behaviors more useful.
Although later C dialects effectively behaved as though struct member names are prefixed by the struct name, so as to give each structure type its own member namespace, in most other respects compilers can be configured, by disabling optimizations if nothing else, to behave in a fashion consistent with Ritchie's 1974 language. Unfortunately, there's no way to distinguish implementations that will consistently behave in that fashion from those that won't; some compilers, especially those which go back to a time before the Standard, don't explicitly document that they support the 1974 behaviors because they were written at a time when compilers were generally expected to do so unless they documented otherwise.
A commonly-used macro in the linux kernel (and other places) is container_of, which is (basically) defined as follows:
#define container_of(ptr, type, member) ((type *)((char *)(ptr) - offsetof(type, member)))
This basically allows recovery of a "parent" structure given a pointer to one of its members:
struct foo {
char ch;
int bar;
};
...
struct foo f = ...
int *ptr = &f.bar; // 'ptr' points to the 'bar' member of 'struct foo' inside 'f'
struct foo *g = container_of(ptr, struct foo, bar);
// now, 'g' should point to 'f', i.e. 'g == &f'
However, it's not entirely clear whether the subtraction contained within container_of is considered undefined behavior.
On one hand, because bar inside struct foo is only a single integer, only *ptr should be valid (as well as the pointer value ptr + 1). Thus, container_of effectively produces an expression like ptr - sizeof(int), which is undefined behavior (even without dereferencing).
On the other hand, §6.3.2.3 p.7 of the C standard states that converting a pointer to a different type and back again shall produce the same pointer. Therefore, "moving" a pointer to the middle of a struct foo object, then back to the beginning should produce the original pointer.
The main concern is the fact that implementations are allowed to check for out-of-bounds indexing at runtime. My interpretation of this and the aforementioned pointer equivalence requirement is that the bounds must be preserved across pointer casts (this includes pointer decay - otherwise, how could you use a pointer to iterate across an array?). Ergo, while ptr may only be an int pointer, and neither ptr - 1 nor *(ptr + 1) are valid, ptr should still have some notion of being in the middle of a structure, so that (char *)ptr - offsetof(struct foo, bar) is valid (even if the pointer is equal to ptr - 1 in practice).
Finally, I came across the fact that if you have something like:
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
while it's undefined behavior to dereference p, the pointer by itself is valid, and required to compare equal to q (see this question). This means that p and q compare the same, but can be different in some implementation-defined manner (such that only q can be dereferenced). This could mean that given the following:
// assume same 'struct foo' and 'f' declarations
char *p = (char *)&f.bar;
char *q = (char *)&f + offsetof(struct foo, bar);
p and q compare the same, but could have different boundaries associated with them, as the casts to (char *) come from pointers to incompatible types.
To sum it all up, the C standard isn't entirely clear about this type of behavior, and attempting to apply other parts of the standard (or, at least my interpretations of them) leads to conflicts. So, is it possible to define container_of in a strictly-conforming manner? If so, is the above definition correct?
This was discussed here after comments on my answer to this question.
TLDR
It is a matter of debate among language lawyers as to whether programs using container_of are strictly conforming, but pragmatists using the container_of idiom are in good company and are unlikely to run into issues running programs compiled with mainstream tool chains on mainstream hardware. In other words:
strictly conforming: debated
conforming: yes, for all practical purposes, in most situations
What can be said today
There is no language in the C17 standard that unambiguously requires support for the container_of idiom.
There are defect reports that suggest the standard intends to allow implementations room to forbid the container_of idiom by tracking "provenance" (i.e. the valid bounds) of objects along with pointers. However, these alone are not normative.
There is recent activity in the C memory object model study group that aims to provide more rigor to this and similar questions. See Clarifying the C memory object model - N2012 from 2016, Pointers are more abstract than you might expect from 2018, and A Provenance-aware Memory Object Model for C - N2676 from 2021.
Depending on when you read this, there may be newer documents available at the WG14 document log. Additionally, Peter Sewell collects related reference material here: https://www.cl.cam.ac.uk/~pes20/cerberus/. These documents do not change what a strictly conforming program is today (in 2021, for versions C17 and older), but they suggest that the answer may change in newer versions of the standard.
Background
What is the container_of idiom?
This code demonstrates the idiom by expanding the contents of the macro usually seen implementing the idiom:
#include <stddef.h>
struct foo {
long first;
short second;
};
void container_of_idiom(void) {
struct foo f;
char* b = (char*)&f.second; /* Line A */
b -= offsetof(struct foo, second); /* Line B */
struct foo* c = (struct foo*)b; /* Line C */
}
In the above case, a container_of macro would typically take a short* argument intended to point to the second field of a struct foo. It would also take arguments for struct foo and second, and would expand to an expression returning struct foo*. It would employ the logic seen in lines A-C above.
The question is: is this code strictly conforming?
First, let's define "strictly conforming"
C17 4 (5-7) Conformance
A strictly conforming program shall use only those features of the language and library specified in this International Standard. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.
[...] A conforming hosted implementation shall accept any strictly conforming program. [...] A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.
A conforming program is one that is acceptable to a conforming implementation.
(For brevity I omitted the definition of "freestanding" implementations, as it concerns limitations on the standard library not relevant here.)
From this we see that strict conformance is quite strict, but a conforming implementation is allowed to define additional behavior as long as it does not alter the behavior of a strictly conforming program. In practice, almost all implementations do this; this is the "practical" definition that most C programs are written against.
For the purposes of this answer I'll confine the discussion to strictly conforming programs, and talk about merely conforming programs at the end.
Defect reports
The language standard itself is somewhat unclear on the question, but several defect reports shed more light on the issue.
DR 51
DR 51 asks questions about this program:
#include <stdlib.h>
struct A {
char x[1];
};
int main() {
struct A *p = (struct A *)malloc(sizeof(struct A) + 100);
p->x[5] = '?'; /* This is the key line */
return p->x[5];
}
The response to the DR includes (emphasis mine):
Subclause 6.3.2.1 describes limitations on pointer arithmetic, in connection with array subscripting. (See also subclause 6.3.6.) Basically, it permits an implementation to tailor how it represents pointers to the size of the objects they point at. Thus, the expression p->x[5] may fail to designate the expected byte, even though the malloc call ensures that the byte is present. The idiom, while common, is not strictly conforming.
Here we have the first indication that the standard allows implementations to "tailor" pointer representations based on the objects pointed at, and that pointer arithmetic that "leaves" the valid range of the original object pointed to is not strictly conforming.
DR 72 asks questions about this program:
#include <stddef.h>
#include <stdlib.h>
typedef double T;
struct hacked {
int size;
T data[1];
};
struct hacked *f(void)
{
T *pt;
struct hacked *a;
char *pc;
a = malloc(sizeof(struct hacked) + 20 * sizeof(T));
if (a == NULL) return NULL;
a->size = 20;
/* Method 1 */
a->data[8] = 42; /* Line A */
/* Method 2 */
pt = a->data;
pt += 8; /* Line B */
*pt = 42;
/* Method 3 */
pc = (char *)a;
pc += offsetof(struct hacked, data);
pt = (T *)pc; /* Line C */
pt += 8; /* Line D */
*pt = 6 * 9;
return a;
}
Astute readers will notice that /* Method 3 */ above is much like the container_of idiom. I.e. it takes a pointer to a struct type, converts it to char*, does some pointer arithmetic that takes the char* outside the range of the original struct, and uses the pointer.
The committee responded by saying /* Line C */ was strictly conforming but /* Line D */ was not strictly conforming by the same argument given for DR 51 above. Further, the committee said that the answers "are not affected if T has char type."
Verdict: container_of is not strictly conforming (probably)
The container_of idiom takes a pointer to a struct's subobject, converts the pointer to char*, and performs pointer arithmetic that moves the pointer outside the subobject. This is the same set of operations discussed in DR 51 and 72. The committee's intent is clear: they hold that the standard "permits an implementation to tailor how it represents pointers to the size of the objects they point at" and thus "the idiom, while common, is not strictly conforming."
One might argue that container_of side steps the issue by doing the pointer arithmetic in the domain of char* pointers, but the committee says the answer is "not affected if T has char type."
May the container_of idiom be used in practice?
No, if you want to be strict and use only code that is clearly strictly conforming according to current language standards.
Yes, if you are a pragmatist and believe that an idiom widely used in Linux, FreeBSD, and Microsoft Windows C code is enough to label the idiom conforming in practice.
As noted above, implementations are allowed to guarantee behavior in ways not required by the standard. On a practical note, the container_of idiom is used in the Linux kernel and many other projects. It is easy for implementations to support on modern hardware. Various "sanitizer" systems such as Address Sanitizer, Undefined Behavior Sanitizer, Purify, Valgrind, etc., all allow this behavior. On systems with flat address spaces, and even segmented ones, various "pointer games" are common (e.g. converting to integral values and masking off low order bits to find page boundaries, etc). These techniques are so common in C code today that it is very unlikely that such idioms will cease to function on any commonly supported system now or in the future.
In fact, I found one implementation of a bounds checker that gives a different interpretation of C semantics in its paper. The quotes are from the following paper: Richard W. M. Jones and Paul H. J. Kelly. Backwards-compatible bounds checking for arrays and pointers in C programs. In Third International Workshop on Automated Debugging (editors M. Kamkarand D. Byers), volume 2 (1997), No. 009 of Linköping Electronic Articles in Computer and Information Science. Linköping University Electronic Press, Linköping, Sweden. ISSN 1401-9841, May 1997 pp. 13–26. URL http://www.ep.liu.se/ea/cis/1997/009/02/
ANSI C conveniently allows us to define an object as the fundamental unit of memory allocation. [...] Operations are permitted which manipulate pointers within objects, but pointer operations are not permitted to cross between two objects. There is no ordering defined between objects, and the programmer should never be allowed to make assumptions about how objects are arranged in memory.
Bounds checking is not blocked or weakened by the use of a cast (i.e. type coercion). Cast can properly be used to change the type of the object to which a pointer refers, but cannot be used to turn a pointer to one object into a pointer to another. A corollary is that bounds checking is not type checking: it does not prevent storage from being declared with one data structure and used with another. More subtly, note that for this reason, bounds checking in C cannot easily validate use of arrays of structs which contain arrays in turn.
Every valid pointer-valued expression in C derives its result from exactly one original storage object. If the result of the pointer calculation refers to a different object, it is invalid.
This language is quite definitive, but note that the paper was published in 1997, before the DRs above were filed and answered. The best way to interpret the bounds checking system described in the paper is as a conforming implementation of C, but not one that detects all non-strictly-conforming programs. I do see similarities between this paper and A Provenance-aware Memory Object Model for C - N2676 from 2021, however, so in the future ideas similar to the ones quoted above may be codified in the language standard.
The C memory object model study group is a treasure trove of discussions related to container_of and many other closely related problems. From their mailing list archive we have these mentions of the container_of idiom:
2.5.4 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?
The standard is ambiguous on the interaction between the allowable pointer arithmetic (on unsigned char* representation pointers) and subobjects. For example, consider:
Example cast_struct_inter_member_1.c
#include <stdio.h>
#include <stddef.h>
typedef struct { float f; int i; } st;
int main() {
st s = {.f=1.0, .i=1};
int *pi = &(s.i);
unsigned char *pci = ((unsigned char *)pi);
unsigned char *pcf = (pci - offsetof(st,i)) + offsetof(st,f);
float *pf = (float *)pcf;
*pf = 2.0; // is this free of undefined behaviour?
printf("s.f=%f *pf=%f s.i=%i\n",s.f,*pf,s.i);
}
This forms an unsigned char* pointer to the second member (i) of a struct, does arithmetic on that using offsetof to form an unsigned char* pointer to the first member, casts that into a pointer to the type of the first member (f), and uses that to write.
In practice we believe that this is all supported by most compilers and it is used in practice, e.g. as in the Container idiom of Chisnall et al. [ASPLOS 2015], where they discuss container macros that take a pointer to a structure member and compute a pointer to the structure as a whole. They see it heavily used by one of the example programs they studied. We are told that Intel's MPX compiler does not support the container macro idiom, while Linux, FreeBSD, and Windows all rely on it.
The standard says (6.3.2.3p7): "...When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.". This licenses the construction of the unsigned char* pointer pci to the start of the representation of s.i (presuming that a structure member is itself an "object", which itself is ambiguous in the standard), but allows it to be used only to access the representation of s.i.
The offsetof definition in stddef.h, 7.19p3, " offsetof(type,member-designator) which expands to an integer constant expression that has type size_t, the value of which is the offset in bytes, to the structure member (designated by member-designator, from the beginning of its structure (designated by type", implies that the calculation of pcf gets the correct numerical address, but does not say that it can be used, e.g. to access the representation of s.f. As we saw in the discussion of provenance, in a post-DR260 world, the mere fact that a pointer has the correct address does not necessarily mean that it can be used to access that memory without giving rise to undefined behaviour.
Finally, if one deems pcf to be a legitimate char* pointer to the representation of s.f, then the standard says that it can be converted to a pointer to any object type if sufficiently aligned, which for float* it will be. 6.3.2.3p7: "A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned (68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer....". But whether that pointer has the right value and is usable to access memory is left unclear.
This example should be allowed in our de facto semantics but is not clearly allowed in the ISO text.
What needs to be changed in the ISO text to clarify this?
More generally, the ISO text's use of "object" is unclear: does it refer to an allocation, or are struct members, union members, and array elements also "objects"?
The key phrase is "This example should be allowed in our de facto semantics but is not clearly allowed in the ISO text." I take this to mean that the group's documents, like N2676, wish to see container_of supported.
However, in a later message:
2.2 Provenance and subobjects: container-of casts
A key question is whether one can cast from a pointer to the first member of a struct to the struct as a whole, and then use that to access other members. We discussed it previously in N2222 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?, N2222 Q37 Are usable pointers to a struct and to its first member interconvertable?, N2013, and N2012. Some of us had thought that that was uncontroversially allowed in ISO C, by 6.7.2.1p15 ...A pointer to a structure object, suitably converted, points to its initial member..., and vice versa..., but others disagree. In practice, this seems to be common in real code, in the "container-of" idiom.
Though someone suggested that the IBM XL C/C++ compiler does not support it. Clarification from WG14 and compiler teams would be very helpful on this point.
With this, the group sums it up nicely: the idiom is widely used, but there is disagreement about what the standard says about it.
I think it's strictly conforming, or else there's a big defect in the standard. Referring to your last example, the section on pointer arithmetic doesn't give the compiler any leeway to treat p and q differently. It isn't conditional on how the pointer value was obtained, only on what object it points to.
Any interpretation that p and q could be treated differently in pointer arithmetic would require an interpretation that p and q do not point to the same object. Since there's no implementation-dependent behaviour in how you obtained p and q, that would mean they don't point to the same object on any implementation. That would in turn require that p == q be false on all implementations, and so would make all actual implementations non-conforming.
I just want to answer this bit.
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
This is not UB. It is certain that p is a pointer to an element of the array, provided only that it is within bounds. In each case it points to the 6th element of a 25 element array, and can safely be dereferenced. It can also be incremented or decremented to access other elements of the array.
See n3797 S8.3.4 for C++. The wording is different for C, but the meaning is the same. In effect arrays have a standard layout and are well-behaved with respect to pointers.
Let us suppose for a moment that this is not so. What are the implications? We know that the layout of an array int[5][5] is identical to int[25], there can be no padding, alignment or other extraneous information. We also know that once p and q have been formed and given a value, they must be identical in every respect.
The only possibility is that, if the standard says it is UB and the compiler writer implements the standard, then a sufficiently vigilant compiler might either (a) issue a diagnostic based on analysing the data values or (b) apply an optimisation which was dependent on not straying outside the bounds of sub-arrays.
Somewhat reluctantly I have to admit that (b) is at least a possibility. I am led to the rather strange observation that if you can conceal from the compiler your true intentions this code is guaranteed to produce defined behaviour, but if you do it out in the open it may not.
The following simple code segfaults under gcc 4.4.4
#include <stdio.h>
typedef struct Foo Foo;
struct Foo {
char f[25];
};
Foo foo(){
Foo f = {"Hello, World!"};
return f;
}
int main(){
printf("%s\n", foo().f);
}
Changing the final line to
Foo f = foo(); printf("%s\n", f.f);
Works fine. Both versions work when compiled with -std=c99. Am I simply invoking undefined behavior, or has something in the standard changed which permits the code to work under C99? Why does it crash under C89?
I believe the behavior is undefined both in C89/C90 and in C99.
foo().f is an expression of array type, specifically char[25]. C99 6.3.2.1p3 says:
Except when it is the operand of the sizeof operator or the unary
& operator, or is a string literal used to initialize an array, an
expression that has type "array of type" is converted to an
expression with type "pointer to type" that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
The problem in this particular case (an array that's an element of a structure returned by a function) is that there is no "array object". Function results are returned by value, so the result of calling foo() is a value of type struct Foo, and foo().f is a value (not an lvalue) of type char[25].
This is, as far as I know, the only case in C (up to C99) where you can have a non-lvalue expression of array type. I'd say that the behavior of attempting to access it is undefined by omission, likely because the authors of the standard (understandably IMHO) didn't think of this case. You're likely to see different behaviors at different optimization settings.
The new 2011 C standard patches this corner case by inventing a new storage class. N1570 (the link is to a late pre-C11 draft) says in 6.2.4p8:
A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type (including,
recursively, members of all contained structures and unions) refers to
an object with automatic storage duration and temporary lifetime.
Its lifetime begins when the expression is evaluated and its initial
value is the value of the expression. Its lifetime ends when the
evaluation of the containing full expression or full declarator ends.
Any attempt to modify an object with temporary lifetime results in
undefined behavior.
So the program's behavior is well defined in C11. Until you're able to get a C11-conforming compiler, though, your best bet is probably to store the result of the function in a local object (assuming your goal is working code rather than breaking compilers):
[...]
int main(void) {
struct Foo temp = foo();
printf("%s\n", temp.f);
}
printf is a bit funny, because it's one of those functions that takes varargs. So let's break it down by writing a helper function bar. We'll return to printf later.
(I'm using "gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3")
void bar(const char *t) {
printf("bar: %s\n", t);
}
and calling that instead:
bar(foo().f); // error: invalid use of non-lvalue array
OK, that gives an error. In C and C++, you are not allowed to pass an array by value. You can work around this limitation by putting the array inside a struct, for example void bar2(Foo f) {...}
But we're not using that workaround - we're not allowed to pass the array by value. Now, you might think it should decay to a char*, allowing you to pass the array by reference. But decay only works if the array has an address (i.e. is an lvalue). Temporaries, such as the return values of functions, live in a magic land where they don't have an address, so you can't take the address (&) of a temporary. In short, we're not allowed to take the address of a temporary, and hence it can't decay to a pointer. We can pass it neither by value (because it's an array) nor by reference (because it's a temporary).
I found that the following code worked:
bar(&(foo().f[0]));
but to be honest I think that's suspect. Hasn't this broken the rules I just listed?
And just to be complete, this works perfectly as it should:
Foo f = foo();
bar(f.f);
The variable f is not a temporary and hence we can (implicitly, during decay) take its address.
printf, 32-bit versus 64-bit, and weirdness
I promised to mention printf again. According to the above, it should be impossible to pass foo().f to any function (including printf). But printf is funny because it's one of those vararg functions. gcc allowed itself to pass the array by value to printf.
When I first compiled and ran the code, it was in 64-bit mode. I didn't see confirmation of my theory until I compiled in 32-bit (-m32 to gcc). Sure enough I got a segfault, as in the original question. (I had been getting some gibberish output, but no segfault, when in 64 bits).
I implemented my own my_printf (with the vararg nonsense) which printed the actual value of the char * before trying to print the letters pointed at by the char*. I called it like so:
my_printf("%s\n", f.f);
my_printf("%s\n", foo().f);
and this is the output I got (code on ideone):
arg = 0xffc14eb3 // my_printf("%s\n", f.f); // worked fine
string = Hello, World!
arg = 0x6c6c6548 // my_printf("%s\n", foo().f); // it's about to crash!
Segmentation fault
The first pointer value 0xffc14eb3 is correct (it points to the characters "Hello, world!"), but look at the second 0x6c6c6548. That's the ASCII codes for Hell (reverse order - little endianness or something like that). It has copied the array by value into printf and the first four bytes have been interpreted as a 32-bit pointer or integer. This pointer doesn't point anywhere sensible and hence the program crashes when it attempts to access that location.
I think this is in violation of the standard, simply by virtue of the fact that we're not supposed to be allowed to copy arrays by value.
On MacOS X 10.7.2, both GCC/LLVM 4.2.1 ('i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)') and GCC 4.6.1 (which I built) compile the code without warnings (under -Wall -Wextra), in both 32-bit and 64-bit modes. The programs all run without crashing. This is what I'd expect; the code looks fine to me.
Maybe the problem on Ubuntu is a bug in the specific version of GCC that has since been fixed?