Allowing struct field to overflow to the next field - c

Consider the following simple example:
struct __attribute__ ((__packed__)) {
int code[1];
int place_holder[100];
} s;
void test(int n)
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
The for-loop is writing to the field code, which is of size 1. The next field after code is place_holder.
I would expect that in case of n > 1, the write to code array would overflow and 1 would be written to place_holder.
However, when compiling with -O2 (on gcc 4.9.4 but probably on other versions as well) something interesting happens.
The compiler identifies that the loop might overflow the array code, and limits loop unrolling to 1 iteration.
This is easy to see when compiling with -fdump-tree-all and looking at the last tree pass ("t.optimized"):
;; Function test (test, funcdef_no=0, decl_uid=1366, symbol_order=1)
Removing basic block 5
test (int n)
{
<bb 2>:
# DEBUG i => 0
# DEBUG i => 0
if (n_4(D) > 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
s.code[0] = 1;
# DEBUG i => 1
# DEBUG i => 1
<bb 4>:
return;
}
So in this case the compiler completely unrolled the loop to a single iteration.
My questions are:
From the C specification's viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
Let's assume I'm aware of the struct layout in memory and know what I'm doing when deliberately overflowing the code array.
Is there a way to prevent gcc from unrolling the loop in such case? I know I can completely prevent loop unrolling, however I'm still interested in loop unrolling on other cases. I also suspect that the analysis the compiler is doing might affect passes other than loop unrolling.
gcc is assuming I'm not going to overflow when accessing my array, so what I'm really looking for is a way to tell the compiler not to make this assumption (by providing some compiler option).
I'm aware it's a bad practice to write such code that overflows from one field to another, and I'm not intending to write such code.
I'm also aware of the practice to put an array (possibly zero sized) as the last struct field to allow it to overflow, this is well supported by compilers, while in this case the array code is not the last field.
So this is not a question of "how to fix the code", but rather a question of understanding the compiler assumptions and affecting them.
These questions came up when I observed existing code that was already written in such way, and debugged it to find out why it's not behaving as the original developer expected it to behave.
The risk is that there are other places in the code where such problem exists. Static analysis tools can help to find out, but I would also like to know if there's a way to make the compiler tolerate such code and still generate the result we would expect.
Update
I got a clear answer to question (1) above, but not to question (2).
Can gcc allow this as an extension, by some compile options?
Is there a way to at least get a warning when gcc identifies it? (and it clearly identifies it, by optimizing things out).
That's important in order to identify such cases in a large existing code base.

From the C specification's viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
It is undefined behavior. The arr[i] operator is syntactic sugar around *(arr + i). So array access boils down to the binary + operator for pointer arithmetic, C17 6.5.6 additive operators, from §7 and §8:
For the purposes of these operators, a pointer to an object that is not an element of an
array behaves the same as a pointer to the first element of an array of length one with the
type of the object as its element type.
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. /--/
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.
As you noticed, optimizing compilers might exploit these rules to produce faster code.
Is there a way to prevent gcc from unrolling the loop in such case?
There is a special exception rule that can be used, C17 6.3.2.3/7:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
Also, strict aliasing does not apply to character types, because of another special rule in C17 6.5 §7
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types: ... a character type.
These two special rules co-exist in harmony. So assuming we don't mess up alignment etc during the pointer conversion, this means that we are allowed to do this:
unsigned char* i;
for(i = (unsigned char*)&mystruct; i < (unsigned char*)(&mystruct + 1); i++)
{
do_something(*i);
}
This may however read padding bytes etc so it's "implementation-defined". But in theory you can access the struct byte per byte, and as long as the struct offsets are calculated on byte-per-byte basis, you can iterate across multiple members of the struct (or any other object) in this manner.
As far as I can tell, this very questionable-looking code should be well-defined:
#include <stdint.h>
#include <string.h>
#include <stdio.h>
struct __attribute__ ((__packed__)) {
int code[1];
int place_holder[100];
} s;
void test(int val, int n)
{
for (unsigned char* i = (unsigned char*)&s;
i < (unsigned char*)&s + n*sizeof(int);
i += _Alignof(int))
{
if((uintptr_t)i % _Alignof(int) == 0) // not really necessary, just defensive prog.
{
memcpy(i, &val, sizeof(int));
printf("Writing %d to address %p\n", val, (void*)i);
}
}
}
int main (void)
{
test(42, 3);
printf("%d %d %d\n", s.code[0], s.place_holder[0], s.place_holder[1]);
}
This works fine on gcc and clang (x86). How efficient it is, well that's another story. Please don't write code like this, though.

From the C specification's viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?
It's undefined behavior to access an array out-of-bounds. From C11 J.2:
The behavior is undefined in the following circumstances:
[...]
An array subscript is out of range [...]
Is there a way to prevent gcc from unrolling the loop in such case?
Alias code with a volatile pointer. Even using a plain intermediary pointer seems to work.
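A minimal sketch of the volatile-pointer idea, under stated assumptions: the function name test_volatile and the struct tag s_t are made up for illustration, and the struct is declared without the packed attribute, relying on a _Static_assert for the layout instead. Each store goes through a volatile lvalue, so the compiler has to emit it. This only sidesteps the optimization seen in the question; as far as the standard is concerned, indexing past code through a pointer derived from it is still not defined, so treat the result as compiler-specific and check it on your own toolchain.

```c
#include <stddef.h>

struct s_t {
    int code[1];
    int place_holder[100];
};

/* Layout assumption: place_holder starts right after code (no padding). */
_Static_assert(offsetof(struct s_t, place_holder) == sizeof(int),
               "place_holder must directly follow code");

static struct s_t s;

/* Hypothetical name. Every store goes through a volatile lvalue, so the
   compiler must perform each one instead of dropping iterations. */
void test_volatile(int n)
{
    volatile int *p = s.code;
    for (int i = 0; i < n; i++) {
        p[i] = 1; /* for i >= 1 this is out of bounds for code per the standard */
    }
}
```

Calling test_volatile(3) is then expected to set s.code[0], s.place_holder[0] and s.place_holder[1] on gcc and clang, which is the behavior the original developer was after.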

Just _Static_assert the layout and do the pointer arithmetic in (char*), then cast to (int*) and
do the access. No further tricks such as memcpy/_Alignof are required because ints are unpadded
and you are accessing ints where there really are ints.
This alone makes gcc unroll the loop.
Why character-pointer based (char*, signed char*, unsigned char*) pointer arithmetic is required is because
http://port70.net/~nsz/c/c11/n1570.html#J.2 (non-normatively, as it is just an appendix, but gcc seems to follow it) makes out-of bounds accesses UB,
but http://port70.net/~nsz/c/c99/n1256.html#6.2.6.1p4 and http://port70.net/~nsz/c/c99/n1256.html#6.5p6 still allow inspecting any object via character pointers (more discussion on this at Is accessing an element of a multidimensional array out of bounds undefined behavior?).
Alternatively you could do the pointer arithmetic via uintptr_t (then it will be implementation defined)
but gcc optimizes those worse in certain cases (gcc doesn't fold (uintptr_t)p < (uintptr_t)(p+10) into true, but it does so for (char*)p < (char*)(p+10). This could be considered a missed optimization).
struct __attribute__ ((__packed__)) s {
int code[1];
int place_holder[100];
} s;
void test_s(int n) //original
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
#include <stddef.h> //offsetof
void test_s2(int n) //unrolls the loop
{
_Static_assert(offsetof(struct s,code)+sizeof(int)==offsetof(struct s,place_holder),"");
//^will practically hold even without __attribute__((__packed__))
int i; for (i = 0; i < n; i++)
*(int*)((char*)&s.code + (size_t)i*sizeof(s.code[0])) = 1;
}
/////////////
//same code as test_s2
struct r {
int code101[101];
} r;
void test_r(int n)
{
int i;
for (i = 0; i < n; i++) {
r.code101[i] = 1;
}
}
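For comparison, the uintptr_t alternative mentioned above could look roughly like this. It is only a sketch: the names s2, s2_obj and test_uintptr are hypothetical, and the integer-to-pointer round trip is implementation-defined, so it assumes a flat address space such as the one gcc and clang use on mainstream targets.

```c
#include <stddef.h>
#include <stdint.h>

struct s2 {
    int code[1];
    int place_holder[100];
};

/* Same layout check as in test_s2: the two arrays must be contiguous. */
_Static_assert(offsetof(struct s2, code) + sizeof(int)
                   == offsetof(struct s2, place_holder),
               "code and place_holder must be contiguous");

static struct s2 s2_obj;

/* Hypothetical name. Computes each element address as an integer derived
   from the address of the whole struct, then converts back to int*. */
void test_uintptr(int n)
{
    uintptr_t base = (uintptr_t)&s2_obj + offsetof(struct s2, code);
    for (int i = 0; i < n; i++) {
        int *p = (int *)(base + (uintptr_t)i * sizeof(int));
        *p = 1;
    }
}
```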

1. Question:
"From C specification viewpoint, is overflowing (deliberately) from one struct member to the next illegal or undefined behavior?"
It is undefined behavior. The C standard states (emphasis mine):
"A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero)."
Source: ISO/IEC 9899:2018 (C18), §6.5.2.1/2
"When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P) + N (equivalently, N + (P)) and (P) - N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P) + 1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q) - 1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated."
Source: ISO/IEC 9899:2018 (C18), §6.5.6/8
Also non-normative Annex J states with regard to paragraph §6.5.6 in the normative standard:
J.2 Undefined behavior
1 The behavior is undefined in the following circumstances:
....
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).
2. Question (plus update):
"Is there a way to prevent gcc from unrolling the loop in such case?"
"Can gcc allow this as an extension, by some compile options?"
"Is there a way to at least get a warning when gcc identifies it? That's important in order to identify such cases in a large existing code base."
You could try placing an empty inline assembly statement such as asm(""); into the loop, as shown in this answer by Denilson Sá Maia, for example:
for (i = 0; i < n; i++) {
s.code[i] = 1;
asm("");
}
or use #pragma directives around the test function, for example:
#pragma GCC push_options
#pragma GCC optimize ("O0")
void test(int n)
{
int i;
for (i = 0; i < n; i++) {
s.code[i] = 1;
}
}
#pragma GCC pop_options
to prevent the optimization for that specific program part in general and with that the loop unrolling.
Related:
How to prevent gcc optimizing some statements in C?
How to prevent GCC from optimizing out a busy wait loop?
Is there a way to tell GCC not to optimise a particular piece of code?
This does not prevent the loop unrolling, but you can use AddressSanitizer, which has LeakSanitizer integrated and has been built into GCC since version 4.8, to detect accesses to memory that does not belong to the object you are working with.
Edit: As you said your target implementation is MIPS, you can still use Valgrind to detect memory leaks.

In the language Dennis Ritchie described in 1974, the behavior of struct member access operators and pointer arithmetic were defined in terms of machine addresses, and except for the use of object size to scale pointer arithmetic, were agnostic as to the types of objects the addresses represented. The C Standard allows implementations to behave in that fashion when their customers would find it useful, but would also allow them to do other things, such as trapping out-of-bounds array accesses, if customers would find those other behaviors more useful.
Although later C dialects effectively behaved as though struct member names are prefixed by the struct name, so as to give each structure type its own member namespace, in most other respects compilers can be configured, by disabling optimizations if nothing else, to behave in a fashion consistent with Ritchie's 1974 language. Unfortunately, there's no way to distinguish implementations that will consistently behave in that fashion from those that won't; some compilers, especially those which go back to a time before the Standard, don't explicitly document that they support the 1974 behaviors because they were written at a time when compilers were generally expected to do so unless they documented otherwise.

Related

C: Reading 8 bytes from a region of size 0 [-Wstringop-overread] [duplicate]

Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.
Sample Program
#include <stdio.h>
int main() {
int arr[0];
return 0;
}
Clarification
I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.
This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];
I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.
Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others were pointing out its actual use for tail-padded structures, while relevant, isn't exactly what I was looking for.
An array cannot have zero size.
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
The above text applies to a plain array (paragraph 1). For a VLA (variable length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard. A compiler is not allowed to implement it differently.
gcc -std=c99 -pedantic gives a warning for the non-VLA case.
As per the standard, it is not allowed.
However it's been current practice in C compilers to treat those declarations as a flexible array member (FAM) declaration:
C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
The standard syntax of a FAM is:
struct Array {
size_t size;
int content[];
};
The idea is that you would then allocate it so:
void foo(size_t x) {
struct Array* array = malloc(sizeof(struct Array) + x * sizeof(int));
array->size = x;
for (size_t i = 0; i != x; ++i) {
array->content[i] = 0;
}
}
You might also use it statically (gcc extension):
struct Array a = { 3, { 1, 2, 3 } };
This is also known as tail-padded structures (this term predates the publication of the C99 Standard) or struct hack (thanks to Joe Wreschnig for pointing it out).
However, this syntax was standardized (and its effects guaranteed) only in C99. Before that, a constant size was necessary.
1 was the portable way to go, though it was rather strange.
0 was better at indicating intent, but not legal as far as the Standard was concerned and supported as an extension by some compilers (including gcc).
The tail padding practice, however, relies on the fact that storage is available (careful malloc) so is not suited to stack usage in general.
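As a rough, self-contained sketch of the allocation pattern described above (the function name array_create is an assumption for illustration, not part of any library): the header and the flexible tail are allocated in one malloc call, and the caller releases the block with free.

```c
#include <stddef.h>
#include <stdlib.h>

struct Array {
    size_t size;
    int content[]; /* C99 flexible array member, must be last */
};

/* Hypothetical helper: allocates a header plus n zeroed ints in one block.
   Returns NULL if the allocation fails. */
struct Array *array_create(size_t n)
{
    struct Array *a = malloc(sizeof *a + n * sizeof a->content[0]);
    if (a == NULL) {
        return NULL;
    }
    a->size = n;
    for (size_t i = 0; i != n; ++i) {
        a->content[i] = 0;
    }
    return a;
}
```

Using sizeof *a rather than sizeof(size_t) for the header keeps the computation correct even if the compiler adds padding before content.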
In Standard C and C++, a zero-size array is not allowed.
If you're using GCC, compile it with -pedantic option. It will give warning, saying:
zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]
In case of C++, it gives similar warning.
It's totally illegal, and always has been, but a lot of compilers
neglect to signal the error. I'm not sure why you want to do this.
The one use I know of is to trigger a compile time error from a boolean:
char someCondition[ condition ];
If condition is false, then I get a compile time error. Because
compilers do allow this, however, I've taken to using:
char someCondition[ 2 * condition - 1 ];
This gives a size of either 1 or -1, and I've never found a compiler
which would accept a size of -1.
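The trick above can be wrapped in a macro, sketched here under the assumption that a typedef is acceptable as the carrier (the macro name COMPILE_TIME_CHECK and the other identifiers are invented for this example). Since C11, _Static_assert does the same job directly.

```c
/* Hypothetical macro: declares a typedef for a char array whose size is
   1 when cond is true and -1 (rejected by the compiler) when cond is false. */
#define COMPILE_TIME_CHECK(name, cond) typedef char name[2 * !!(cond) - 1]

/* Compiles only if int occupies at least two bytes. */
COMPILE_TIME_CHECK(int_is_at_least_two_bytes, sizeof(int) >= 2);

/* Returns the size of the check array, which is 1 whenever this file compiles. */
unsigned long check_array_size(void)
{
    return (unsigned long)sizeof(int_is_at_least_two_bytes);
}
```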
Another use of zero-length arrays is for making variable-length object (pre-C99). Zero-length arrays are different from flexible arrays which have [] without 0.
Quoted from gcc doc:
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied.
A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).
I'll add that there is a whole page of the online gcc documentation on this topic.
Some quotes:
Zero-length arrays are allowed in GNU C.
In ISO C90, you would have to give contents a length of 1
and
GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data
so you could
int arr[0] = { 1 };
and boom :-)
Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.
The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like
typedef struct {
uint32_t size;
uint8_t data[4]; // Use four, to avoid having padding throw off the size of the struct
} MyStruct; // type name added here only so the typedef is complete
a compiler might get clever and assume the array size really is four:
// As written
foo = myStruct->data[i];
// As interpreted (assuming little-endian hardware)
foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;
Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
You definitely can't have zero-sized arrays according to the standard, but in practice most popular compilers let you do that. So I will try to explain why it can be bad:
#include <cstdio>
int main() {
struct A {
A() {
printf("A()\n");
}
~A() {
printf("~A()\n");
}
int empty[0];
};
A vals[3];
}
As a human, I would expect this output:
A()
A()
A()
~A()
~A()
~A()
Clang prints this:
A()
~A()
GCC prints this:
A()
A()
A()
It is totally strange, so it is a good reason not to use empty arrays in C++ if you can.
There is also an extension in GNU C that lets you create a zero-length array in C, but as I understand it, there should be at least one other member before it in the structure, or you will get very strange results like the ones above if you use C++.

Is it valid to calculate element pointers by explicit arithmetic?

Is the following program valid? (In the sense of being well-defined by the ISO C standard, not just happening to work on a particular compiler.)
struct foo {
int a, b, c;
};
int f(struct foo *p) {
// should return p->c
char *q = ((char *)p) + 2 * sizeof(int);
return *((int *)q);
}
It follows at least some of the rules for well-defined use of pointers:
The value being loaded, is of the same type that was stored at the address.
The provenance of the calculated pointer is valid, being derived from a valid pointer by adding an offset, that gives a pointer still within the original storage instance.
There is no mixing of element types within the struct, that would generate padding to make an element offset unpredictable.
But I'm still not sure it's valid to explicitly calculate and use element pointers that way.
C is a low level programming language. This code is well-defined but probably not portable.
It is not portable because it makes assumptions about the layout of the struct. In particular, you might run into fields being 64-bit aligned on a 64-bit platform where int is 32 bits.
A better way of doing it is to use the offsetof macro.
The C standard allows there to be arbitrary padding between elements of a struct (but not at the beginning of one). Real-world compilers won’t insert padding into a struct like that one, but the DeathStation 9000 is allowed to. If you want to do that portably, use the offsetof() macro from <stddef.h>.
*(int*)((char*)p + offsetof(struct foo, c))
is guaranteed to work. A difference, such as offsetof(struct foo, c) - offsetof(struct foo, b), is also well-defined. (Although, since offsetof() returns an unsigned value, it’s defined to wrap around to a large unsigned number if the difference underflows.)
In practice, of course, use &p->c.
An expression like the one in your original question is guaranteed to work for array elements, however, so long as you do not overrun your buffer. You can also generate a pointer one past the end of an array and compare that pointer to a pointer within the array, but dereferencing such a pointer is undefined behavior.
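Put together, the offsetof-based version of the function from the question might look like the sketch below. The name load_c is chosen here for illustration; the struct is the one from the question.

```c
#include <stddef.h>

struct foo {
    int a, b, c;
};

/* Hypothetical name. Reads p->c through a byte offset computed by offsetof,
   so it does not depend on there being no padding between members. */
int load_c(const struct foo *p)
{
    const char *q = (const char *)p + offsetof(struct foo, c);
    return *(const int *)q;
}
```

The int that is read really lives at that address and is suitably aligned, because offsetof reports where the compiler placed c.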
I think it likely that at least some authors of the Standard intended to allow a compiler given something like:
struct foo { unsigned char a[4], b[4]; } x;
int test(int i)
{
x.b[0] = 1;
x.a[i] = 2;
return x.b[0];
}
to generate code that would always return 1 regardless of the value of i. On the flip side, I think it is extremely likely that nearly all of the Committee would have intended that a function like:
struct foo { char a[4], b[4]; } x;
void put_byte(int);
void test2(unsigned char *p, int sz)
{
for (int i=0; i<sz; i++)
put_byte(p[i]);
}
be capable of outputting all of the bytes in x in a single invocation.
Clang and gcc will assume that any construct which applies the [] operator to a struct or union member will only be used to access elements of that member array, but the Standard defines the behavior of arrayLValue[index] as equivalent to (*((arrayLValue)+index)), and would define the address of x.a's first element, which is an unsigned char*, as equivalent to the address of x, cast to that type. Thus, if code calls test2((unsigned char*)&x, sizeof x), the expression p[i] would be equivalent to x.a[i], which clang and gcc would only support for subscripts in the range 0 to 3.
The only way I see of reading the Standard as satisfying both viewpoints would be to treat support for even the latter construct as a "quality of implementation" issue outside the Standard's jurisdiction, on the assumption that quality implementations would support constructs like the latter with or without a mandate, and there was thus no need to write sufficiently detailed rules to distinguish those two scenarios.

Undefined behavior when working with partially initialized struct in C90

Let's consider the following code:
struct M {
unsigned char a;
unsigned char b;
};
void pass_by_value(struct M);
int main() {
struct M m;
m.a = 0;
pass_by_value(m);
return 0;
}
In the function pass_by_value, m.b is initialized before it is used.
However, since m is passed by value the compiler copies it to the stack already.
No variable has storage class register here. a and b are of type unsigned char.
Does that have to be considered UB in C90? (Please note: I am specifically asking for C90)
This question is very similar to Returning a local partially initialized struct from a function and undefined behavior, but actually the other way around.
The C 1990 standard (and the C 1999 standard) does not contain the sentence that first appears in C 2011 that makes the behavior of using some uninitialized values undefined.
C 2011 6.3.2.1 2 includes:
… If the lvalue has an incomplete type and does not have array type, the behavior is undefined. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
The whole of the corresponding paragraph in C 1990, clause 6.2.2.1, second paragraph, is:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined.
Therefore, the behavior of the code in the question would seem to be defined, inasmuch that it passes the value stored in the structure.
In the absence of explicit statements in the standard, common practice helps guide interpretation. It is perfectly normal not to initialize all members of a structure yet to expect the structure to represent useful data, and therefore the behavior of using the structure as a value must be defined if at least one of its members is initialized. The equivalent question for C 2011 contains mention (from a C defect report) of the standard struct tm in one of its answers. The struct tm may be used to represent a specific date by filling in all of date fields (year, month, day of month) and possibly the time fields (hour, minute, second, even Daylight Savings Time indication) but leaving the day of week and day of year fields uninitialized.
In defining undefined behavior in 3.16, the 1990 standard does say it is “Behavior, upon use … of indeterminately valued objects, for which this International Standard imposes no requirements.” And 6.5.7 says “… If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate…” However, a structure with automatic storage duration in which one member, but not another, has been initialized is neither fully initialized nor not initialized. Given the intended uses of structures, I would say we should not consider use of the value of a partially initialized structure to be subject to being made undefined by 3.16.
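The struct tm usage mentioned above can be sketched as follows. The function name weekday_of is an assumption made for this example; it fills in only the date and time fields, leaves tm_wday and tm_yday unset, and lets mktime (which ignores the incoming values of those two members) compute them.

```c
#include <time.h>

/* Hypothetical helper: returns the day of the week (0 = Sunday) for a
   calendar date, or -1 if mktime cannot represent it. */
int weekday_of(int year, int month, int day)
{
    struct tm t;             /* deliberately not initialized as a whole */
    t.tm_year = year - 1900; /* years since 1900 */
    t.tm_mon = month - 1;    /* months since January */
    t.tm_mday = day;
    t.tm_hour = 12;          /* noon, away from any DST switch around midnight */
    t.tm_min = 0;
    t.tm_sec = 0;
    t.tm_isdst = -1;         /* let the library decide about DST */
    /* tm_wday and tm_yday are left unset; mktime fills them in. */
    if (mktime(&t) == (time_t)-1) {
        return -1;
    }
    return t.tm_wday;
}
```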
Under C90, if an object held Indeterminate Value, each and every bit could independently be zero or one, regardless of whether or not they would in combination represent a valid bit pattern for the object's type. If an implementation specified the behavior of attempting to read each and every one of the 2ⁿ individual possible bit patterns an object could hold, the behavior of reading an object with Indeterminate Value would be equivalent to reading the value of an arbitrarily chosen bit pattern. If there were any bit patterns for which an implementation did not specify the effect of an attempted read, then the effects of trying to read an object that might hold such bit patterns would be likewise unspecified.
Code generation efficiency could be improved in some cases by specifying the behavior of uninitialized objects more loosely, in a way which would not otherwise be consistent with sequential program execution as specified but would nonetheless meet program requirements. For example, given something like:
struct foo { short dat[16]; } x,y,z;
void test1(int a, int b, int c, int d)
{
struct foo temp;
temp.dat[a] = 1;
temp.dat[b] = 2;
temp.dat[c] = 3;
temp.dat[d] = 4;
x=temp;
y=temp;
}
void test2(int a, int b, int c, int d)
{
test1(a,b,c,d);
z=x;
}
If client code only cares about the values of x and y that correspond to values of temp that were written, efficiency might be improved, while still meeting requirements, if the code were rewritten as:
void test1(int a, int b, int c, int d)
{
x.dat[a] = 1;
y.dat[a] = 1;
x.dat[b] = 2;
y.dat[b] = 2;
x.dat[c] = 3;
y.dat[c] = 3;
x.dat[d] = 4;
y.dat[d] = 4;
}
The fact that the original function test1 doesn't do anything to initialize temp suggests that it won't care about what is yielded by any individual attempt to read it. On the other hand, nothing within the code for test2 would imply that client code wouldn't care about whether all members of x held the same values as corresponding values of y. Thus, such an inference would more likely be dangerous there.
The C Standard makes no attempt to define behavior in situations where an optimization might yield program behavior which, although useful, would be inconsistent with sequential processing of non-optimized code. Instead, the principle that optimizations must never affect any defined behavior is taken to imply that the Standard must characterize as Undefined all actions whose behavior would be visibly affected by optimization, leaving implementor discretion the question of what aspects of behavior should or should not be defined in what circumstances. Ironically, the only time the Standard's laxity with regard to this behavior would allow more efficient code generation outside contrived scenarios would be in cases where implementations treat the behavior as at least loosely defined, and programmers are able to exploit that. If a programmer had to explicitly initialize all elements of temp to avoid having the compiler behave in completely nonsensical fashion, that would eliminate any possibility of optimizing out the unnecessary writes to unused elements of x and y.

Accessing a struct field in adjacent to a conditional expression

Why is there a difference between the following two code segments?
struct g {
int m[100];
};
struct a {
struct g ttt[40];
struct g hhh[40];
}man;
extern int bar(int z);
// this code generates a call to memcpy.
void foo1(int idx){
bar(((idx == 5) ? man.hhh[idx+7] : man.ttt[idx+7]).m[idx+3]);
}
// this code doesn't generate a call to memcpy.
void foo2(int idx){
bar(((idx == 5) ? man.hhh[idx+7].m[idx+3] : man.ttt[idx+7].m[idx+3]));
}
In both code segments I want to send the same field (depending on the conditional expression) to the bar function. However, the first one generates a call to memcpy (this can be seen clearly when compiling with clang for the PowerPC architecture). I wrote a small main and ran the two functions, and they gave me the same output (compiled with gcc 4.4.7).
This answer applies to C only - the question is dual-tagged but I am assuming OP is using C for reasons that will become clear later.
Here's the first expression again:
((idx == 5) ? man.hhh[idx+7] : man.ttt[idx+7]).m[idx+3]
The type of the conditional expression is struct g. However, the result of the conditional operator in C is not an lvalue. What is it then?
In C11 6.2.4p8 it's explicitly defined as a value of temporary lifetime.
In C90 the m[idx+3] is ill-formed: m is not an lvalue because the . operator only yields an lvalue if the left operand was an lvalue; and the array-pointer decay only applies to lvalues.
In C99 array-pointer decay happens to all values, but it's not explicitly stated where decayed m points.
Personally I think it's clear enough that in C99, something akin to the C11 behaviour was intended, so I would regard the code as well-defined in C99. Further discussion here. This is probably a moot point, as on all the compilers I tried, they gave the same result for -std=c99 as they did for -std=c11.
Moving forward then: In C11 (and probably C99), Snippet 1 should give the right result. Your compiler does that, but it seems that it optimizes the code poorly. It naively copies the whole value resulting from the conditional operator before indexing into it.
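As a rough illustration of that naive strategy (a hand-written sketch, not actual compiler output), the code below spells out the copy explicitly and compares it with the direct form of Snippet 2. The helper names fill_man, read_via_copy, read_direct and check_copy_matches are invented for this example, and the call to bar is replaced by a plain return so the selected value can be inspected.

```c
#include <assert.h>

struct g { int m[100]; };
struct a { struct g ttt[40]; struct g hhh[40]; };

struct a man;

/* Give every element a distinct value so a wrong selection is detectable. */
void fill_man(void) {
    for (int i = 0; i < 40; i++) {
        for (int j = 0; j < 100; j++) {
            man.ttt[i].m[j] = i * 100 + j;
            man.hhh[i].m[j] = 10000 + i * 100 + j;
        }
    }
}

/* Assumed model of unoptimized Snippet 1: copy the whole struct, then index. */
int read_via_copy(int idx) {
    struct g tmp = (idx == 5) ? man.hhh[idx + 7] : man.ttt[idx + 7]; /* copies sizeof(struct g) bytes */
    return tmp.m[idx + 3];
}

/* Snippet 2: the conditional selects a single int, so no struct copy is needed. */
int read_direct(int idx) {
    return (idx == 5) ? man.hhh[idx + 7].m[idx + 3] : man.ttt[idx + 7].m[idx + 3];
}

/* Both forms should agree for every idx that keeps idx + 7 below 40. */
void check_copy_matches(void) {
    for (int idx = 0; idx <= 32; idx++) {
        assert(read_via_copy(idx) == read_direct(idx));
    }
}
```

Only one int out of the copied struct is ever used, which is why an optimizer that sees through the copy can drop it entirely.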
Testing with godbolt, I found that all versions of "x86 clang" and "PowerPC gcc 4.8" used memcpy; but "x86 gcc" was able to optimize the code.
In C++, the result of the conditional operator is an lvalue if the second and third operands were lvalues of the same type, so this problem shouldn't arise in that language.
To avoid this problem, use an alternative where the result of the conditional operator is not a struct or union value. For example you could just use Snippet 2; or either of:
bar( ((idx == 5) ? &man.hhh[idx+7] : &man.ttt[idx+7])->m[idx+3] );
bar( ((idx == 5) ? man.hhh : man.ttt)[idx+7].m[idx+3] );
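A small sketch suggesting that these alternatives pick the same element as Snippet 2. The helper names (fill_man, via_snippet2, via_pointer, via_array, check_forms_agree) are invented for illustration, and bar is replaced by a plain return so the results can be compared.

```c
#include <assert.h>

struct g { int m[100]; };
struct a { struct g ttt[40]; struct g hhh[40]; };

struct a man;

/* Distinct value per element: ttt holds i*100+j, hhh holds 10000+i*100+j. */
void fill_man(void) {
    for (int i = 0; i < 40; i++) {
        for (int j = 0; j < 100; j++) {
            man.ttt[i].m[j] = i * 100 + j;
            man.hhh[i].m[j] = 10000 + i * 100 + j;
        }
    }
}

/* Snippet 2: the conditional yields an int. */
int via_snippet2(int idx) {
    return (idx == 5) ? man.hhh[idx + 7].m[idx + 3] : man.ttt[idx + 7].m[idx + 3];
}

/* The conditional yields a pointer to struct g; nothing of struct type is copied. */
int via_pointer(int idx) {
    return ((idx == 5) ? &man.hhh[idx + 7] : &man.ttt[idx + 7])->m[idx + 3];
}

/* The conditional yields a pointer to the first element of the chosen array. */
int via_array(int idx) {
    return ((idx == 5) ? man.hhh : man.ttt)[idx + 7].m[idx + 3];
}

/* All three forms should agree for every idx that keeps idx + 7 below 40. */
void check_forms_agree(void) {
    for (int idx = 0; idx <= 32; idx++) {
        assert(via_pointer(idx) == via_snippet2(idx));
        assert(via_array(idx) == via_snippet2(idx));
    }
}
```

In each alternative the conditional operator produces a scalar (an int or a pointer), so there is no struct-typed temporary for the compiler to materialize.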

arrays that are not lvalues and sequence point restriction

In ISO C99, arrays that are not lvalues still decay to pointers, and may be subscripted, although they may not be modified or used after the next sequence point. (source)
I understand that this feature allows array indexing in cases such as a function returning a structure containing an array, which is not allowed in C89 ( http://yarchive.net/comp/struct_return.html).
Will you please help me understand why there is a restriction on using or modifying it after the next sequence point?
Note, the text OP quoted is from GCC documentation. Relevant text from C99 to back up that quote is:
C99 6.5.2.2/5
If an attempt is made to modify the result of a function call or to access it after the next
sequence point, the behavior is undefined.
and also from the list of changes in the Foreword:
conversion of array to pointer not limited to lvalues
I don't have the C89 text to compare, but the C99 description of array-to-pointer conversion (6.3.2.1/3) does not mention any restriction on the array being an lvalue. Also, the C99 section on subscripting (6.5.2.1/2) describes the expression being subscripted only as a postfix expression; it does not mention lvalues either.
Consider this code:
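#include <stdio.h>
#include <string.h>
struct foo
{
    char buf[20];
};
struct foo foo(char const *p) { struct foo f; strcpy(f.buf, p); return f; }
int main()
{
    // Each pointer refers into a returned struct that is gone once the statement ends.
    char *hello = foo("hello").buf;
    char *bye = foo("bye").buf;
    // other stuff...
    printf("%s\n", hello);  // undefined behavior: the pointed-to object no longer exists
    printf("%s\n", bye);    // undefined behavior for the same reason
}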
Where do the pointers hello and bye point to? The purpose of this clause is to say that the compiler does not have to keep all of the returned objects hanging around in memory somewhere in order to make those pointers remain valid indefinitely.
Instead, hello is only valid up until the next ; in this case (or the next sequence point in general). This leaves the compiler free to implement returning structs by value as a hidden pointer parameter, as Chris Torek describes in his excellent post, with the hidden storage being "freed" at the end of the current statement.
NB. The C99 situation isn't quite as simple as described in Chris's post, as the following has to work:
printf("%s %s\n", foo("hello").buf, foo("bye").buf);
My install of gcc 4.8 seems to get it right though - that works with -std=c99, and segfaults with -std=c89.
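If the strings are needed beyond the statement that makes the call, one option is to store each returned struct in a named object first, so that the pointers refer to storage with an ordinary lifetime. The following is a minimal sketch of that idea; make_foo, strings_survive and self_check are hypothetical names (the function is renamed from foo only to keep it apart from the struct tag), and the bounded copy is an addition not present in the original.

```c
#include <assert.h>
#include <string.h>

struct foo {
    char buf[20];
};

/* Returns a struct by value; the copy is bounded so long inputs are truncated. */
struct foo make_foo(const char *p) {
    struct foo f;
    strncpy(f.buf, p, sizeof f.buf - 1);
    f.buf[sizeof f.buf - 1] = '\0';
    return f;
}

/* Returns 1 if both strings are still intact after later statements. */
int strings_survive(void) {
    struct foo hello = make_foo("hello"); /* named object: lives until the block ends */
    struct foo bye = make_foo("bye");
    const char *h = hello.buf;            /* points into hello, not into a temporary */
    const char *b = bye.buf;
    return strcmp(h, "hello") == 0 && strcmp(b, "bye") == 0;
}

void self_check(void) {
    assert(strings_survive());
}
```

The cost is one explicit struct copy per call, which a compiler may well elide; in exchange, h and b stay valid for as long as hello and bye are in scope.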

Resources