C struct pointer step from first to last element - c

Providing struct test:
#include <stdio.h>
int main() {
struct {
char* one;
char* two;
char* three;
} test;
char **ptr = &test.one;
*ptr = "one";
*++ptr = "two";
*++ptr = "three";
printf ("%s\n", test.one);
printf ("%s\n", test.two);
printf ("%s\n", test.three);
}
Question: Is there a guarantee that the elements in test struct are always in consecutive memory order? (So starting with the first struct element ++ptr will always point to the next element in the test struct?)

For pointers, you will almost certainly observe that behaviour, but the only guarantee the C language makes is that the elements are ordered in memory. There might be gaps in between to optimize the alignment of the fields (for performance, especially on RISC architectures).
The right way to do this is to use the offsetof macro, or make it an array.

As noted in a comment to the question:
Yes; the elements of a structure are stored in the order they are declared. What can upset calculations is that there may be gaps (padding) between elements. It won't happen when they're all the same type.
However, you should ask yourself: if I need to step through the elements sequentially, why am I not using an array? Arrays are designed for sequential access; structures are not (and writing code to access the elements of a structure sequentially is messy).
Some relevant parts of the standard are:
§6.7.2.1 Structure and union specifiers
¶6 As discussed in 6.2.5, a structure is a type consisting of a sequence of members, whose
storage is allocated in an ordered sequence, and a union is a type consisting of a sequence
of members whose storage overlap.
¶15 Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
§6.2.5 Types
¶20 …
A structure type describes a sequentially allocated nonempty set of member objects
(and, in certain circumstances, an incomplete array), each of which has an optionally
specified name and possibly distinct type.

Related

char[] size not being counted

I have the following code:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h> /* for EXIT_SUCCESS */
typedef struct E_s {
uint32_t a;
uint32_t b;
uint32_t c;
} E_t;
typedef struct S_s {
uint32_t data_sz;
char data[];
} S_t;
typedef struct F_s {
E_t E;
S_t S;
char data[16];
//} __attribute__((packed)) full_msg_t;
} F_t;
int main(int argc, char* argv[])
{
F_t out;
printf("sizeof(out.data) = %zu\n", sizeof(out.data));
printf("sizeof(out.E) = %zu\n", sizeof(E_t));
printf("sizeof(out.S) = %zu\n", sizeof(S_t));
printf("sizeof(out) = %zu\n", sizeof(F_t));
return EXIT_SUCCESS;
}
When I run the code, I see the following output:
sizeof(out.data) = 16
sizeof(out.E) = 12
sizeof(out.S) = 4
sizeof(out) = 32
Question: Why is the size of S_t 4 (third line of output)? I was expecting it to be 8 (uint32_t + char[]). Why is the size of char[] not included?
Furthermore, both out.data and out.S.data point to the same memory location, which caused me to dive deep and find the above observation. Any clue here will also be very helpful. I was not expecting those 2 variables to overlap.
The standard specifies that the variable part of a structure with a flexible array member (FAM) is ignored when the size is calculated:
As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.
Note that the struct F_s (aka F_t) should not be accepted; that violates the constraint in §6.7.2.1 ¶3:
A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array.
The compiler should reject that (or, at least, emit a diagnostic) because constraint violations require a diagnostic. Even if the compiler doesn't reject it outright, you can't actually use the FAM of the embedded S_t because the data member of F_t doesn't move — the offsets of the elements of a structure are fixed at compile time. It would, de facto, use the data element of F_t, but that isn't defined behaviour.
In this struct:
typedef struct S_s {
uint32_t data_sz;
char data[];
} S_t;
The data member is a flexible array member. Such a member does not contribute to the size of a struct as its size is not specified. This is spelled out in section 6.7.2.1p18 of the C standard:
As a special case, the last element of a structure with more than one
named member may have an incomplete array type; this is called a
flexible array member. In most situations, the flexible array member
is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more
trailing padding than the omission would imply.
So the size of S_t does not include the data member, which is why sizeof(S_t) is 4.
Such a member can only be used when memory for the struct is dynamically allocated. For example:
S_t *s = malloc(sizeof(S_t) + 10);
This allows you to access from s->data[0] to s->data[9]
This also means that you can't put a struct with a flexible array member inside of another struct or in an array, because there's no way to know exactly where the flexible array member ends.
This is spelled out in section 6.7.2.1p3:
A structure or union shall not contain a member with incomplete or
function type (hence, a structure shall not contain an instance of
itself, but may contain a pointer to an instance of itself), except
that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union
containing, possibly recursively, a member that is such a structure)
shall not be a member of a structure or an element of an array
char data[]; is a flexible array member, and it is explicitly guaranteed not to have its size counted, because it is mainly intended to be used as malloc(sizeof(S_t) + n), where n is the size of the data array.
As for S_t S; inside the other struct, that's invalid C since the struct containing a flexible array member must be placed at the end and in the outer-most struct and you didn't do that. So your code doesn't compile in standard C, and therefore it isn't possible to make assumptions that out.S.data and out.data are somehow the same memory, because all of that is beyond the scope of the C language. I suppose it might be possible that GNU C offers deterministic behavior in the form of non-standard extensions, but I'm not aware of any such guarantees.
Because char[] in structure S_t is a flexible array member, a feature introduced in the C99 standard of the C programming language.
This may be helpful: flexible-array-members-structure-c

Does a C struct hold its members in a contiguous block of memory? [duplicate]

Let's say my code is:
typedef struct {
int x;
double y;
char z;
} Foo;
would x, y, and z, be right next to each other in memory? Could pointer arithmetic 'iterate' over them?
My C is rusty so I can not quite get the program right to test this.
Here is my code in full.
#include <stdlib.h>
#include <stdio.h>
typedef struct {
int x;
double y;
char z;
} Foo;
int main() {
Foo *f = malloc(sizeof(Foo));
f->x = 10;
f->y = 30.0;
f->z = 'c';
// Pointer to iterate.
for(int i = 0; i == sizeof(Foo); i++) {
if (i == 0) {
printf(*(f + i));
}
else if (i == (sizeof(int) + 1)) {
printf(*(f + i));
}
else if (i ==(sizeof(int) + sizeof(double) + 1)) {
printf(*(f + i));
}
else {
continue;
}
return 0;
}
No, it is not guaranteed for struct members to be contiguous in memory.
From §6.7.2.1 point 15 in the C standard (page 115 here):
There may be unnamed padding within a structure object, but not at its beginning.
Most of the time, something like:
struct mystruct {
int a;
char b;
int c;
};
Is indeed aligned to sizeof(int), like this:
0  1  2  3  4  5  6  7  8  9  10 11
[a         ][b][padding][c         ]
Yes and no.
Yes, the members of a struct are allocated within a contiguous block of memory. In your example, an object of type Foo occupies sizeof (Foo) contiguous bytes of memory, and all the members are within that sequence of bytes.
But no, there is no guarantee that the members themselves are adjacent to each other. There can be padding bytes between any two members, or after the last one. The standard does guarantee that the first defined member is at offset 0, and that all the members are allocated in the order in which they're defined (which means you can sometimes save space by reordering the members).
Normally compilers use just enough padding to satisfy the alignment requirements of the member types, but the standard doesn't require that.
So you can't (directly) iterate over the members of a structure. If you want to do that, and if all the members are of the same type, use an array.
You can use the offsetof macro, defined in <stddef.h>, to determine the byte offset of (non-bitfield) member, and it can sometimes be useful to use that to build a data structure that can be used to iterate over the members of a structure. But it's tedious, and rarely more useful than simply referring to the members by name -- particularly if they have different types.
would x, y, and z, be right next to each other in memory?
No. The exact layout of a struct is implementation dependent; there is no guarantee that struct members are right next to each other. One reason is padding, which compilers insert so that each member meets its alignment requirement.
Could pointer arithmetic 'iterate' over them?
No. You can only do pointer arithmetic for pointers to the same type.
would x, y, and z, be right next to each other in memory?
They could be, but don't have to be. The ISO C standard fixes the order of the members and puts the first one at offset 0, but it does not mandate the exact offsets of the others.
In general, a compiler will place the elements at offsets that are "optimal" for the architecture it compiles to. So, on 32-bit CPUs, most compilers will, by default, place elements at offsets that are multiples of 4 (as that makes for the most efficient access). But most compilers also have ways to specify different placement (alignment).
So, if you have something like:
struct X {
uint8_t a;
uint32_t b;
};
Then offset of a would be 0, but offset of b would be 4 on most 32-bit compilers with default options.
Could pointer arithmetic 'iterate' over them?
Not like the code in your example. Pointer arithmetic on a pointer to a structure moves the address in steps of the size of the whole structure. So, if you have:
struct X a[2];
struct X *p = a;
then p+1 == a+1.
To "iterate" over elements you would need to cast p to uint8_t* and then add the offset of each element to it (using the standard offsetof macro), element by element.
It depends on the padding decided on by the compiler (which is influenced by the requirements and advantages of the target architecture). The C standard does guarantee that there is no padding before the first member of a struct, but after that, you cannot assume anything. However, if the sizeof the struct equals the sum of the sizeof each of its members, then there is no padding.
You can enforce no padding with a compiler-specific directive. On MSVC, that's:
#pragma pack(push, 1)
// your struct...
#pragma pack(pop)
GCC has __attribute__((packed)) for the equivalent effect.
There are multiple issues with trying to use pointer arithmetic in this manner.
The first issue, as has been mentioned in other answers, is that there could be padding throughout the struct throwing off your calculations.
C11 working draft 6.7.2.1 p15: (bold emphasis mine)
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
The second issue is that pointer arithmetic is done in multiples of the size of the type being pointed to. In the case of a struct, if you add 1 to a pointer to a struct, the pointer will be pointing to an object after the struct. Using your example struct Foo:
Foo x[3];
Foo *y = x+1; // y points to the second Foo (x[1]), not the second byte of x[0]
6.5.6 p8:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist.
A third issue is that performing pointer arithmetic such that the result points more than one past the end of the object causes undefined behavior, as does dereferencing a pointer to one element past the end of the object obtained through the pointer arithmetic. So even if you had a struct containing three ints with no padding in between and took a pointer to the first int and incremented it to point to the second int, dereferencing it would cause undefined behavior.
More from 6.5.6: (bold-italic emphasis mine)
Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
A fourth issue is that dereferencing a pointer to one type as another type results in undefined behavior. This attempt at type-punning is often referred to as a strict-aliasing violation. The following is an example of undefined behavior through strict-aliasing violation even though the data types are the same size (assuming 4-byte int and float) and nicely aligned:
int x = 1;
float y = *(float *)&x;
6.5 p7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
Summary:
No, a C struct does not necessarily hold its members in contiguous memory, and even if it did, you still couldn't do what you want to do with pointer arithmetic.

What is the difference between a struct of the same datatype and an array?

If I had a struct consisting only of short unsigned integers, that would be stored in contiguous memory locations of the same size; the same also applies if I have an array of short unsigned integers. How would they be any different? Is it how they are accessed? I am aware that an array is accessed by using a pointer to reference the starting value of the array while an array operator sets an offset from that memory location, does the same apply to structs or are structs accessed by using the memory location for each piece of data?
No, they need not be the same and most likely, they are not.
In case of structure members, there can be padding between the members. So, it is not guaranteed that consecutive members will reside in contiguous memory. In this case, based on implementation, pointer arithmetic using the address of the first element may or may not be valid.
Quoting relevant parts of C11 standard, chapter §6.7.2.1/p15, Structure and union specifiers,
[..] There may be unnamed
padding within a structure object, but not at its beginning.
and
chapter §6.5.3.4, sizeof operator,
[...] When
applied to an operand that has structure or union type, the result is the total number of
bytes in such an object, including internal and trailing padding.
In case of an array, however, all members are guaranteed to reside in contiguous memory and pointer arithmetic is deterministic.

Pointer difference across members of a struct?

The C99 standard states that:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object
Consider the following code:
struct test {
int x[5];
char something;
short y[5];
};
...
struct test s = { ... };
char *p = (char *) s.x;
char *q = (char *) s.y;
printf("%td\n", q - p);
This obviously breaks the above rule, since the p and q pointers are pointing to different "array objects", and, according to the rule, the q - p difference is undefined.
But in practice, why should such a thing ever result in undefined behaviour? After all, the struct members are laid out sequentially (just as array elements are), with any potential padding between the members. True, the amount of padding will vary across implementations and that would affect the outcome of the calculations, but why should that result be "undefined"?
My question is, can we suppose that the standard is just "ignorant" of this issue, or is there a good reason for not broadening this rule? Couldn't the above rule be rephrased to "both shall point to elements of the same array object or members of the same struct"?
My only suspicion are segmented memory architectures where the members might end up in different segments. Is that the case?
I also suspect that this is the reason why GCC defines its own __builtin_offsetof, in order to have a "standards compliant" definition of the offsetof macro.
EDIT:
As already pointed out, arithmetic on void pointers is not allowed by the standard. It is a GNU extension that fires a warning only when GCC is passed -std=c99 -pedantic. I'm replacing the void * pointers with char * pointers.
Subtraction and relational operators (on type char*) between addresses of member of the same struct are well defined.
Any object can be treated as an array of unsigned char.
Quoting N1570 6.2.6.1 paragraph 4:
Values stored in non-bit-field objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object of that
type, in bytes. The value may be copied into an object of type
unsigned char [ n ] (e.g., by memcpy); the resulting set of bytes is
called the object representation of the value.
...
My only suspicion are segmented memory architectures where the members
might end up in different segments. Is that the case?
No. For a system with a segmented memory architecture, normally the compiler will impose a restriction that each object must fit into a single segment. Or it can permit objects that occupy multiple segments, but it still has to ensure that pointer arithmetic and comparisons work correctly.
Pointer arithmetic requires the two pointers involved to be part of the same object because it doesn't make sense otherwise.
The quoted section of the standard specifically refers to two unrelated objects such as int a[5]; and int b[5];. Pointer arithmetic requires knowing the type of the object that the pointers point to (I am sure you are aware of this already).
i.e.
int a[5];
int *p = &a[1]+1;
Here p is calculated by knowing that &a[1] refers to an int object, and hence it is incremented by 4 bytes (assuming sizeof(int) is 4).
Coming to the struct example, I don't think it can possibly be defined in a way to make pointer arithmetic between struct members legal.
Let's take the example,
struct test {
int x[5];
char something;
short y[5];
};
Pointer arithmetic is not allowed with void pointers by the C standard (compiling with gcc -Wall -pedantic test.c would catch that). I think you are using gcc, which assumes void* is similar to char* and allows it.
So,
printf("%td\n", q - p);
is equivalent to
printf("%td\n", (char*)q - (char*)p);
as pointer arithmetic is well defined if the pointers point to within the same object and are character pointers (char* or unsigned char*).
Using correct types, it would be:
struct test s = { ... };
int *p = s.x;
short *q = s.y;
printf("%td\n", q - p);
Now, how can q - p be performed? Based on sizeof(int) or sizeof(short)? How can the size of char something; that's in the middle of these two arrays be calculated?
That should explain why it's not possible to perform pointer arithmetic on objects of different types.
Even if all members are of the same type (thus no type issue as stated above), it's better to use the standard macro offsetof (from <stddef.h>) to get the difference between struct members, which has a similar effect to pointer arithmetic between members:
printf("%zu\n", offsetof(struct test, y) - offsetof(struct test, x));
So I see no necessity to define pointer arithmetic between struct members by the C standard.
Yes, you are allowed to perform pointer arithmetic on the bytes of a structure:
N1570 - 6.3.2.3 Pointers p7:
... When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
This means that, for the programmer, the bytes of the structure shall be seen as a contiguous area, regardless of how it may have been implemented in the hardware.
Not with void* pointers though; that is a non-standard compiler extension. As mentioned in the paragraph from the standard, it applies only to character type pointers.
Edit:
As mafso pointed out in the comments, the above is only true as long as ptrdiff_t, the type of the subtraction result, has enough range for the result. Since the range of size_t can be larger than that of ptrdiff_t, if the structure is big enough it's possible that the addresses are too far apart.
Because of this it's preferable to use the offsetof macro on structure members and calculate the result from those.
I believe the answer to this question is simpler than it appears, the OP asks:
but why should that result be "undefined"?
Well, let's see what the definition of undefined behavior is in the draft C99 standard, section 3.4.3:
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements
It is simply behavior for which the standard does not impose a requirement, which perfectly fits this situation: the results are going to vary depending on the architecture, and attempting to specify the results would probably have been difficult if not impossible in a portable manner. This leaves the question, why would they choose undefined behavior as opposed to, let's say, implementation-defined or unspecified behavior?
Most likely it was made undefined behavior to limit the number of ways an invalid pointer could be created. This is consistent with the fact that we are provided with offsetof to remove the one potential need for pointer subtraction of unrelated objects.
Although the standard does not really define the term invalid pointer, we get a good description in Rationale for International Standard—Programming Languages—C which in section 6.3.2.3 Pointers says (emphasis mine):
Implicit in the Standard is the notion of invalid pointers. In
discussing pointers, the Standard typically refers to “a pointer to an
object” or “a pointer to a function” or “a null pointer.” A special
case in address arithmetic allows for a pointer to just past the end
of an array. Any other pointer is invalid.
The C99 rationale further adds:
Regardless how an invalid pointer is created, any use of it yields
undefined behavior. Even assignment, comparison with a null pointer
constant, or comparison with itself, might on some systems result in
an exception.
This strongly suggests to us that a pointer to padding would be an invalid pointer, although it is difficult to prove that padding is not an object, the definition of object says:
region of data storage in the execution environment, the contents of
which can represent values
and notes:
When referenced, an object may be interpreted as having a particular
type; see 6.3.2.1.
I don't see how we can reason about the type or the value of padding between elements of a struct, and therefore they are not objects; at the least, it strongly indicates padding is not meant to be considered an object.
I should point out the following:
from the C99 standard, section 6.7.2.1:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
It isn't so much that the result of pointer subtraction between members is undefined so much as it is unreliable (i.e. not guaranteed to be the same between different instances of the same struct type when the same arithmetic is applied).

What type punning/pointer magic IS defined by the standard?

I can't seem to wrap my head around certain parts of the C standard, so I'm coming here to clear up that foggy, anxious uncertainty that comes when I have to think about what such tricks are defined behaviour and what are undefined or violate the standard. I don't care whether or not it will WORK, I care if the C standard considers it legal, defined behaviour.
Such as this, which I am fairly certain is UB:
struct One
{
int Hurr;
char Durr[2];
float Nrrr;
} One;
struct Two
{
int Hurr;
char Durr[2];
float Nrrr;
double Wibble;
} Two;
One = *(struct One*)&Two;
This is not all I am talking about. Such as casting the pointer to One to int*, and dereferencing it, etc. I want to get a good understanding of what such things are defined so I can sleep at night. Cite places in the standard if you can, but be sure to specify whether it's C89 or C99. C11 is too new to be trusted with such questions IMHO.
I think that technically that example is UB, too. But it will almost certainly work, and neither gcc nor clang complain about it with -pedantic.
To start with, the following is well-defined in C99 (§6.5.2.3/6): [1]
union OneTwo {
struct One one;
struct Two two;
};
union OneTwo tmp = {.two = {3, {'a', 'b'}, 3.14f, 3.14159} };
struct One one = tmp.one;
The fact that accessing the "punned" struct One through a union must work implies that the layout of the prefix of struct Two is identical to struct One. This cannot be contingent on the existence of a union because a given composite type can only have one storage layout, and its layout cannot be contingent on its use in a union because the union does not need to be visible to every translation unit in which the struct is used.
Furthermore, in C all types are no more than a sequence of bytes (unlike, for example, C++) (§6.2.6.1/4) [2]. Consequently, the following is also guaranteed to work:
struct One one;
struct Two two = ...;
unsigned char tmp[sizeof one];
memcpy(tmp, &two, sizeof one);
memcpy(&one, tmp, sizeof one);
Given the above and the convertibility of any pointer type to a void*, I think it is reasonable to conclude that the temporary storage above is unnecessary, and it could have been written directly as:
struct One one;
struct Two two = ...;
memcpy(&one, &two, sizeof one);
From there to the direct assignment through an aliased pointer as in the OP is not a very big leap, but there is an additional problem for the aliased pointer: it is theoretically possible for the pointer conversion to create an invalid pointer, because it's possible that the bit format of a struct Two* differs from a struct One*. Although it is legal to cast one pointer type to another pointer type with looser alignment (§6.3.2.3/7) [3] and then convert it back again, it is not guaranteed that the converted pointer is actually usable, unless the conversion is to a character type. In particular, it is possible that the alignment of struct Two is different from (more strict than) the alignment of struct One, and that the bit format of the more strongly-aligned pointer is not directly usable as a pointer to the less strongly-aligned struct. However, it is hard to see an argument against the almost equivalent:
one = *(struct One*)(void*)&two;
although this may not be explicitly guaranteed by the standard.
In comments, various people have raised the spectre of aliasing optimizations. The above discussion does not touch on aliasing at all because I believe that it is irrelevant to a simple assignment. The assignment must be sequenced after any preceding expressions and before any succeeding ones; it clearly modifies one and almost as clearly references two. An optimization which made a preceding legal mutation of two invisible to the assignment, would be highly suspect.
But aliasing optimizations are, in general, possible. Consequently, even though all of the above pointer casts should be acceptable in the context of a single assignment expression, it would certainly not be legal behaviour to retain the converted pointer of type struct One* which actually points into an object of type struct Two and expect it to be usable either to mutate a member of its target or to access a member of its target which has otherwise been mutated. The only context in which you could get away with using a pointer to struct One as though it were a pointer to the prefix of struct Two is when the two objects are overlaid in a union.
--- Standard references:
[1] "if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible."
[2] "Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT
bits, where n is the size of an object of that type, in bytes. The value may be copied into
an object of type unsigned char [n] (e.g., by memcpy)…"
[3] "A pointer to an object type may be converted to a pointer to a different object type… When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object."
C99 6.7.2.1 says:
Para 5
As discussed in 6.2.5, a structure is a type consisting of a sequence
of members, whose storage is allocated in an ordered sequence
Para 12
Each non-bit-field member of a structure or union object is aligned in
an implementation-defined manner appropriate to its type.
Para 13
Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There
may be unnamed padding within a structure object, but not at its
beginning
That last paragraph covers your second question (casting the pointer to One to int*, and dereferencing it).
The first point - whether it is valid to "Downcast" a Two* to a One* - I could not find specifically addressed. It boils down to whether the other rules ensure that the memory layout of the fields of One and the initial fields of Two are identical in all cases.
The members have to be packed in ordered sequence, no padding is allowed at the beginning, and they have to be aligned according to type, but the standard does not actually say that the layout needs to be the same (even though in most compilers I am sure it is).
There is, however, a better way to define these structures so that you can guarantee it:
struct One
{
int Hurr;
char Durr[2];
float Nrrr;
} One;
struct Two
{
struct One one;
double Wibble;
} Two;
You might think you can now safely cast a Two* to a One* - Para 13 says so. However strict aliasing might bite you somewhere unpleasant. But with the example above you don't need to anyway:
One = Two.one;
A1. Undefined behaviour, because of Wibble.
A2. Defined.
S9.2 in N3337.
Two standard-layout struct (Clause 9) types are layout-compatible if
they have the same number of non-static data members and corresponding
non-static data members (in declaration order) have layout-compatible
types
Your structs would be layout compatible and thus interchangeable but for Wibble. There is a good reason too: Wibble might cause different padding in struct Two.
A pointer to a standard-layout struct object, suitably converted using
a reinterpret_cast, points to its initial member (or if that member is
a bit-field, then to the unit in which it resides) and vice versa.
I think that guarantees that you can dereference the initial int.

Resources