Aliasing struct and array the conformant way - c

In the old days of pre-ISO C, the following code would have surprised nobody:
#include <math.h>

struct Point {
    double x;
    double y;
    double z;
};
double dist(struct Point *p1, struct Point *p2) {
    double d2 = 0;
    double *coord1 = &p1->x;
    double *coord2 = &p2->x;
    int i;
    for (i=0; i<3; i++) {
        double d = coord2[i] - coord1[i]; // THE problem
        d2 += d * d;
    }
    return sqrt(d2);
}
At that time, we all knew that the alignment of double allowed the compiler to add no padding in struct Point, and we just assumed that pointer arithmetic would do the job.
Unfortunately, this problematic line uses pointer arithmetic (p[i] being by definition *(p + i)) outside of any array, which is explicitly not allowed by the standard. Draft n1570 for C11 says in 6.5.6 Additive operators §8:
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. If the pointer operand points to an element of
an array object, and the array is large enough, the result points to an element offset from
the original element such that the difference of the subscripts of the resulting and original
array elements equals the integer expression...
As nothing is said about the case where the pointer and the result are not elements of the same array, the behaviour is not specified by the standard and is therefore Undefined Behaviour (even if all common compilers are happy with it...)
Question:
This idiom avoided replicating code with just x changed to y and then z, which is quite error prone. What would be a conformant way to browse the members of a struct as if they were elements of the same array?
Disclaimer: It obviously only applies to members of the same type, and padding can be detected with a simple static_assert as shown in that other question of mine, so padding, alignment and mixed types are not my problem here.
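For reference, the padding check mentioned in the disclaimer could look like the sketch below. It assumes a C11 compiler (static_assert from <assert.h>, offsetof from <stddef.h>). It only rejects implementations that pad struct Point; it does not by itself make the pointer arithmetic above conformant.

```c
#include <assert.h>   /* static_assert (C11), assert */
#include <stddef.h>   /* offsetof */

struct Point {
    double x;
    double y;
    double z;
};

/* Compilation fails if the implementation inserts padding in struct Point. */
static_assert(offsetof(struct Point, y) == sizeof(double),
              "padding between x and y");
static_assert(offsetof(struct Point, z) == 2 * sizeof(double),
              "padding between y and z");
static_assert(sizeof(struct Point) == 3 * sizeof(double),
              "trailing padding in struct Point");
```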

C does not define any way to specify that the compiler must not add padding between the named members of struct Point, but many compilers have an extension that would provide for that. If you use such an extension, or if you're just willing to assume that there will be no padding, then you could use a union with an anonymous inner struct, like so:
#include <math.h>   /* sqrt */
#include <stdio.h>  /* printf */

union Point {
    struct {
        double x;
        double y;
        double z;
    };
    double coords[3];
};
You can then access the coordinates by their individual names or via the coords array:
double dist(union Point *p1, union Point *p2) {
    double *coord1 = p1->coords;
    double *coord2 = p2->coords;
    double d2 = 0;
    for (int i = 0; i < 3; i++) {
        double d = coord2[i] - coord1[i];
        d2 += d * d;
    }
    return sqrt(d2);
}
int main(void) {
    // Note: I don't think the inner braces are necessary, but they silence
    // warnings from gcc 4.8.5:
    union Point p1 = { { .x = .25, .y = 1, .z = 3 } };
    union Point p2;
    p2.x = 2.25;
    p2.y = -1;
    p2.z = 0;
    printf("The distance is %lf\n", dist(&p1, &p2));
    return 0;
}

This is mainly a complement to JohnBollinger's answer. Anonymous struct members do allow a clean and neat syntax, and C defines a union as a type consisting of a sequence of members whose storage overlap (6.7.2.1 Structure and union specifiers §6). Accessing a member of a union is then specified in 6.5.2.3 Structure and union members:
3 A postfix expression followed by the . operator and an identifier designates a member of
a structure or union object. The value is that of the named member,95) and is an lvalue if
the first expression is an lvalue.
and the (non-normative but informative) note 95 clarifies:
95) If the member used to read the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.
That means that, for the current version of the standard, aliasing a struct with an array by means of an anonymous struct member in a union is explicitly defined behaviour.

Related

Is an incomplete array type compatible with a complete array type?

Question: are int (*)[] and int (*)[1] compatible?
Demo 1:
typedef int T0;
typedef T0 T1[];
typedef T1* T2[];
T1 x1 = { 13 };
T2 x2 = { &x1 };
GCC, Clang, ICC generate no diagnostics.
MSVC generates:
<source>(21): warning C4048: different array subscripts: 'T1 (*)' and 'T0 (*)[1]'
Note: the T1 (*) is int (*)[], the T0 (*)[1] is int (*)[1].
Demo 2:
int c[][1] = {0};
int (*r)[] = c;
GCC, Clang, ICC generate no diagnostics.
MSVC generates:
<source>(30): warning C4048: different array subscripts: 'int (*)[0]' and 'int (*)[1]'
Extra: C2x, 6.7.6.2 Array declarators, 6:
For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values.
Are int (*)[] and int (*)[1] compatible?
It seems so.
C2x, 6.2.7 Compatible type and composite type, 5:
EXAMPLE Given the following two file scope declarations:
int f(int (*)(), double (*)[3]);
int f(int (*)(char *), double (*)[]);
The resulting composite type for the function is:
int f(int (*)(char *), double (*)[3]);
and:
A composite type can be constructed from two types that are compatible; ...
Here we see that double (*)[3] is the composite type constructed from types double (*)[3] and double (*)[].
Hence, the types double (*)[3] and double (*)[] are compatible.
Hence, the types int (*)[] and int (*)[1] are compatible.
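As a small illustration of that conclusion, the sketch below converts between int (*)[1] and int (*)[] in both directions without any cast, which C accepts when the pointed-to types are compatible. The function names first_element and compat_demo are hypothetical and only serve the example.

```c
#include <assert.h>

/* Takes a pointer to an array of int of unspecified length. */
int first_element(int (*p)[])
{
    return (*p)[0];
}

int compat_demo(void)
{
    int a[1] = { 13 };
    int (*q)[1] = &a;   /* pointer to a complete array type            */
    int (*r)[] = q;     /* no cast needed: int (*)[1] -> int (*)[]     */
    int (*s)[1] = r;    /* and back again, also without a cast         */

    /* All three pointers designate the same array, so each read gives 13. */
    return first_element(q) + (*r)[0] + (*s)[0];
}
```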
I think you are complicating things by adding identifiers and typedefs.
typedef int T0;
T0 is equivalent to int so let's simplify and use int.
typedef T0 T1[]; ===> int (T1[]); ===> int T1[];
Here, T1 is equivalent to an unspecified array type, so let's simplify and use int [].
typedef T1* T2[]; ===> int (*T2[])[];
This is an array of pointers to integer arrays.
Here, T2 is equivalent to an unspecified array type, so let's simplify and use int (*[])[].
When you write
T1 x1 = { 13 }; ===> int x1[] = { 13 }; ===> int x1[1] = { 13 };
the type is finally completely specified by the initializer, which gives the array a length of one element.
T2 x2 = { &x1 }; ===> int (*x2[])[] = { &x1 };
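where x1 has type int [1], so &x1 has type int (*)[1]. As there is only one initializer, this fixes the outer array dimension:
... ===> int (*x2[1])[];
The pointed-to array type stays incomplete, because an initializer only determines the size of the array being initialized, not the type of its elements. Initializing an element of type int (*)[] with &x1, of type int (*)[1], is accepted because the two pointer types are compatible.
So finally x2 is an array of one pointer to an array of int of unspecified size, and that pointer points to x1.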
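What the initializers complete can be checked at compile time. The sketch below reuses the typedefs from Demo 1 and assumes a C11 compiler for _Static_assert. The function read_through_x2 is a hypothetical helper added only to show that the stored pointer is usable.

```c
#include <assert.h>

typedef int T0;
typedef T0 T1[];     /* array of int, size not specified          */
typedef T1 *T2[];    /* array of pointers to T1, size not specified */

T1 x1 = { 13 };      /* one initializer: x1 ends up with one element */
T2 x2 = { &x1 };     /* one initializer: x2 ends up with one element */

/* sizeof is allowed here because both objects now have complete types. */
_Static_assert(sizeof x1 == sizeof(int), "x1 holds exactly one int");
_Static_assert(sizeof x2 == sizeof(int (*)[]), "x2 holds exactly one pointer");

/* Follow the pointer stored in x2 back to the first element of x1. */
int read_through_x2(void)
{
    return (*x2[0])[0];
}
```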
I'd suggest simplifying your examples so that you can work through incomplete type definitions yourself, using only one incomplete type rather than three levels of them. That way you would probably be able to solve the problem without help.

initialising structs and passing by reference

I'm fairly new to C and I am having trouble working with structs. I have the following code:
typedef struct uint8array {
    uint8 len;
    uint8 data[];
} uint8array;
int compare_uint8array(uint8array* arr1, uint8array* arr2) {
    printf("%i %i\n data: %i, %i\n", arr1->len, arr2->len, arr1->data[0], arr2->data[0]);
    if (arr1->len != arr2->len) return 1;
    return 0;
}
int compuint8ArrayTest() {
    printf("--compuint8ArrayTest--\n");
    uint8array arr1;
    arr1.len = 2;
    arr1.data[0] = 3;
    arr1.data[1] = 5;
    uint8array arr2;
    arr2.len = 4;
    arr2.data[0] = 3;
    arr2.data[1] = 5;
    arr2.data[2] = 7;
    arr2.data[3] = 1;
    assert(compare_uint8array(&arr1, &arr2) != 0);
}
Now the output of this program is:
--compuint8ArrayTest--
3 4
data: 5, 3
Why are the values not what I initialized them to? What am I missing here?
In your case, uint8 data[]; is a flexible array member. You need to allocate memory for data before you can actually access it.
In your code, you're trying to access an invalid memory location, causing undefined behavior.
Quoting C11, chapter §6.7.2.1 (emphasis mine)
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed; the
offset of the array shall remain that of the flexible array member, even if this would differ
from that of the replacement array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt is made to access that
element or to generate a pointer one past it.
A proper usage example can also be found in chapter §6.7.2.1
EXAMPLE 2 After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if
p had been declared as:
struct { int n; double d[m]; } *p;
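Applied to the struct from the question, the allocation could look like the following sketch. It assumes that uint8 is a typedef for uint8_t, which the question does not show, and the helper uint8array_new is a hypothetical name introduced for illustration. The caller releases the object with free.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint8_t uint8;   /* assumption: the question's uint8 is an 8-bit unsigned type */

typedef struct uint8array {
    uint8 len;
    uint8 data[];        /* flexible array member: needs storage from malloc */
} uint8array;

/* Allocates the header plus room for len elements and copies src into data.
   Returns NULL if the allocation fails. */
uint8array *uint8array_new(const uint8 *src, uint8 len)
{
    uint8array *arr = malloc(sizeof *arr + len * sizeof arr->data[0]);
    if (arr == NULL)
        return NULL;
    arr->len = len;
    if (len > 0)
        memcpy(arr->data, src, len);
    return arr;
}
```

With objects created this way, compare_uint8array(arr1, arr2) receives pointers to structures whose data members really have len elements.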

Scale elements of a structure using a cast

I got this question/assignment on a test yesterday at university. It goes like this:
Give the following structure:
typedef struct _rect {
    int width;
    int height;
} rect;
How could you scale the width and height members using a cast to int* without explicitly accessing the two members?
So basically, given that struct, how could I do
rect *my_rectangle = malloc(sizeof(rect));
my_rectangle->width = 4;
my_rectangle->height = 6;
// Change this part
my_rectangle->width /= 2;
my_rectangle->height /= 2;
Using a cast to int or int*?
You can scale reliably only the first member:
*((int *) my_rectangle) /= 2;
This does not violate the strict aliasing rule, as the Standard explicitly allows converting a pointer to a struct object into a pointer to its first member.
C11 §6.7.2.1/15 Structure and union specifiers
Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning.
Assuming that there is no padding between those members, the second member may be scaled as well, as long as the pointer has the same (compatible) type as the member, that is int.
You have the starting address of the structure, so you can access the individual members by incrementing that address accordingly. Since both members are of type int here, you can use an int pointer; otherwise it is better to use a char pointer.
int *ptr = (int *)my_rectangle;
*(ptr) /= 2;
*(ptr+1) /= 2;
What they're trying to teach you is how that struct is represented in memory. It has two int members, so it's possible that within memory it could also be viewed as if it were an array of int. So the following could possibly work.
rect *my_rectangle = malloc(sizeof(rect));
my_rectangle->width = 4;
my_rectangle->height = 6;
int *my_array=(int *) my_rectangle;
my_array[0] /= 2;
my_array[1] /= 2;
But it's a really dirty hack and it's entirely possible that a compiler could store your struct in an entirely different way, such that casting it to an int * would not have the desired effect. So not at all recommended if you want to write good clean portable code IMHO.
And if someone were to change the struct, such as by making width & height be float instead of int, the code would probably compile without any issues or warnings and then not work at all how you'd expect.
This task is rather questionable since you can easily end up with invoking poorly defined behavior.
As it happens, we can get away with it since the types of the struct are int, same as the pointer type, and there is an exception in the strict aliasing rule for this.
There's still the issue of padding though, so we would have to ensure that there's no padding present between the integers.
The obscure result is something like this:
// BAD. Don't write code like this!
#include <stddef.h>
#include <stdio.h>
typedef struct
{
    int width;
    int height;
} rect;
int main (void)
{
    rect my_rectangle;
    my_rectangle.width = 4;
    my_rectangle.height = 6;
    int* ptr = (int*)&my_rectangle;
    *ptr /= 2;
    _Static_assert(offsetof(rect, height) == sizeof(int), "Padding detected.");
    ptr++;
    *ptr /= 2;
    printf("%d %d", my_rectangle.width, my_rectangle.height);
    return 0;
}
It is much better practice to use a union instead. We would still have the same padding issue, but wouldn't have to worry about strict aliasing. And the code becomes easier to read:
#include <stddef.h>
#include <stdio.h>
typedef union
{
    struct
    {
        int width;
        int height;
    };
    int array[2];
} rect;
int main (void)
{
    rect my_rectangle;
    my_rectangle.width = 4;
    my_rectangle.height = 6;
    _Static_assert(offsetof(rect, height) == sizeof(int), "Padding detected.");
    my_rectangle.array[0] /= 2;
    my_rectangle.array[1] /= 2;
    printf("%d %d", my_rectangle.width, my_rectangle.height);
    return 0;
}
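The concern raised in an earlier answer, that changing width and height to float would still compile and then silently misbehave, can be turned into a compile-time error. The sketch below is one possible guard, assuming a C11 compiler for _Generic and _Static_assert; the function rect_scale_down is a hypothetical helper, not part of the original assignment.

```c
#include <assert.h>
#include <stddef.h>

typedef union
{
    struct
    {
        int width;
        int height;
    };
    int array[2];
} rect;

/* Refuse to compile if a member stops being int or if padding appears.
   The operand of _Generic is not evaluated, so the null pointer is never
   dereferenced. */
_Static_assert(_Generic(((rect *)0)->width, int: 1, default: 0),
               "width must be int");
_Static_assert(_Generic(((rect *)0)->height, int: 1, default: 0),
               "height must be int");
_Static_assert(offsetof(rect, height) == sizeof(int), "Padding detected.");

/* Divides every dimension by divisor through the array view of the union. */
void rect_scale_down(rect *r, int divisor)
{
    for (size_t i = 0; i < sizeof r->array / sizeof r->array[0]; i++)
        r->array[i] /= divisor;
}
```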

Why does the `const` qualifier get discarded in this initialization?

I'm getting initialization discards ‘const’ qualifier from pointer target type warning in the line .grid_col = &c_ax_gd i.e. assigning an address expression to a pointer, which is part of a constant structure.
struct color {
    double r;
    double g;
    double b;
    double a;
};
struct graph {
    double origin[2];
    double dim[2];
    double scale[2];
    double grid_line_width;
    double axes_line_width;
    struct color *grid_col;
    struct color *axes_col;
};
int main(void) {
    static const struct color c_ax_gd = {0.5, 0.5, 0.5, 1};
    static const double h = 600, w = 800;
    const struct graph g = {
        .origin = {w / 2, h / 2},
        .dim = {w, h},
        .scale = {w / 20, h / 20},
        .grid_line_width = 0.5,
        .axes_line_width = 1,
        .grid_col = &c_ax_gd,
        .axes_col = &c_ax_gd
    };
    ....
}
I found the following in C99 standard
More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:
an arithmetic constant expression,
a null pointer constant,
an address constant, or
an address constant for an object type plus or minus an integer constant expression.
An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it shall be created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly by the use of an expression of array or function type
My question is, doesn't that mean &c_ax_gd is an address constant? If so, how does using an address constant inside an initializer for a constant structure discard the const qualifier?
The problem is somewhere else. Even if a struct is const, if it has pointers as members, the objects those pointers point to are not automatically const. You can see that from the following example.
struct example {
    int * p;
};
int
main()
{
    /* const */ int a = 1;
    const struct example s = {&a};
    *(s.p) = 2;
    return 0;
}
Therefore, if you uncomment the /* const */ and assign the address of a to s.p, you're discarding the const qualifier on a. That's what your compiler warns you about.
g is const, which means all its members are const (including grid_col). So grid_col is a const pointer to struct color and &c_ax_gd is a pointer to const struct color.
You are trying to initialize a pointer to struct color with a pointer to const struct color, and that's why you get the warning.
C99 6.5.16.1
both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right
If you ignore the warning, you'll just get undefined behavior if you modify the value grid_col points to.
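One way to make the warning go away without a cast is to declare the pointer members as pointers to const, so that the pointed-to type carries all the qualifiers of the initializer's target type. The sketch below is a trimmed variant of the question's structs; the names default_graph and grid_alpha are hypothetical and exist only for this illustration.

```c
#include <assert.h>

struct color {
    double r;
    double g;
    double b;
    double a;
};

struct graph {
    double grid_line_width;
    double axes_line_width;
    const struct color *grid_col;   /* pointer to const: colors are read-only */
    const struct color *axes_col;
};

static const struct color c_ax_gd = {0.5, 0.5, 0.5, 1};

/* &c_ax_gd has type const struct color *, which now matches the members. */
static const struct graph default_graph = {
    .grid_line_width = 0.5,
    .axes_line_width = 1,
    .grid_col = &c_ax_gd,
    .axes_col = &c_ax_gd
};

/* Reading through the pointer is still allowed; writing through it is not. */
double grid_alpha(void)
{
    return default_graph.grid_col->a;
}
```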
