sizeof anonymous nested struct - c

Suppose I have a structure that I'm using to model various packet formats:
#define MaxPacket 20
typedef struct {
u8 packetLength;
union {
u8 bytes[MaxPacket];
struct {
u16 field1;
u16 field2;
u16 field3;
} format1;
struct {
double value1;
double value2;
} format2;
};
} Packet;
I can expect that sizeof(Packet) will be 21. But is there any way to do something like:
sizeof(Packet.format2)
? I've tried that, but the compiler is not happy. Obviously, I could pull format1 out as a separate typedef and then I could sizeof(format1). But I'm curious whether I have to go through all of that. I like the hierarchical composition of the formats. This is with gcc on an 8-bit processor.
I'm equally interested if there's a way to use the nested type. If I have to do a lot of
aPacketPointer->format2.value1; // not so onerous, but if the nesting gets deeper...
Then sometimes it would be nice to do:
Packet.format2 *formatPtr = &aPacketPointer->format2;
formatPtr->value2; // etc
Again, refactoring into a bunch of preceding typedefs would solve this problem, but then I lose the nice namespacing effect of the nested dotted references.

For something that will work even in C90, you can use a macro modeled on your toolchain's offsetof() macro:
#define sizeof_field(s,m) (sizeof((((s*)0)->m)))
Adjust it accordingly if your toolchain's offsetof() macro isn't based on casting 0 to a pointer to the structure's type.
When I use it like so:
std::cout << sizeof_field(Packet,format1) << std::endl;
std::cout << sizeof_field(Packet,format2) << std::endl;
I get the output:
6
16
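For reference, here is a self-contained C sketch of the same measurement. The fixed-width types from <stdint.h> stand in for u8 and u16, which the question does not define, and the two helper functions are made up for illustration. The anonymous union itself needs C11 or a compiler extension; only the macro is C90-compatible.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define MaxPacket 20

typedef struct {
    uint8_t packetLength;
    union {
        uint8_t bytes[MaxPacket];
        struct {
            uint16_t field1;
            uint16_t field2;
            uint16_t field3;
        } format1;
        struct {
            double value1;
            double value2;
        } format2;
    };
} Packet;

/* sizeof does not evaluate its operand, so the null pointer is never dereferenced. */
#define sizeof_field(s, m) (sizeof(((s *)0)->m))

/* The macro yields a constant expression, so it also works in static assertions. */
static_assert(sizeof_field(Packet, bytes) == MaxPacket,
              "bytes view should span MaxPacket bytes");

/* Hypothetical helpers that expose the member sizes as ordinary values. */
size_t format1_size(void) { return sizeof_field(Packet, format1); }
size_t format2_size(void) { return sizeof_field(Packet, format2); }
```

The value returned for format2 depends on the target's double, which is why the sizes can differ between a desktop and a small microcontroller.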
For your second question, if you're willing to rely on GCC's typeof extension you can create a similar macro for declaring pointers to your nested anonymous structs:
#define typeof_field(s,m) typeof(((s*)0)->m)
...
typeof_field(Packet,format2)* f2 = &foo.format2;
To be honest, I find that construct pretty ugly, but it might still be better than other options you have available.
GCC documents that the "operand of typeof is evaluated for its side effects if and only if it is an expression of variably modified type or the name of such a type", so the apparent null pointer dereference should not result in undefined behavior when a variable length array is not involved.
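A minimal sketch of how the typeof_field macro might be used inside a function follows. It assumes GCC or Clang (it uses the __typeof__ spelling, which both accept even in strict ISO modes), uses <stdint.h> types in place of u8 and u16, and the helper name fill_format2 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t packetLength;
    union {
        uint8_t bytes[20];
        struct { uint16_t field1, field2, field3; } format1;
        struct { double value1, value2; } format2;
    };
} Packet;

/* The operand of __typeof__ is not evaluated here, only its type is used. */
#define typeof_field(s, m) __typeof__(((s *)0)->m)

/* Hypothetical helper: fills format2 through a pointer to the nested type. */
double fill_format2(Packet *p, double a, double b)
{
    assert(p);
    typeof_field(Packet, format2) *f2 = &p->format2;
    f2->value1 = a;
    f2->value2 = b;
    return f2->value1 + f2->value2;
}
```

The pointer f2 keeps deeper member accesses short without introducing a separate typedef for the nested struct.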

Using C11 or C99, create a dummy compound literal and seek its size.
printf("%zu\n", sizeof( ((Packet){ 0, { "" }}).format2 ));
Output
16

You can just give those nested structs a name, no need for a typedef. Like this:
typedef struct {
u8 packetLength;
union {
u8 bytes[MaxPacket];
struct myformat1 {
u16 field1;
u16 field2;
u16 field3;
} format1;
struct myformat2 {
double value1;
double value2;
} format2;
};
} Packet;
Then you can write e.g. sizeof(struct myformat1), declare variables of that type, etc.
You could also add a typedef afterwards, e.g.
typedef struct myformat1 myformat1;
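As a rough illustration of what the tags buy you, the sketch below measures a nested struct and passes a pointer to it around. It relies on C scoping rules, where a tag declared inside another struct is visible in the enclosing scope; the <stdint.h> types and the helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t packetLength;
    union {
        uint8_t bytes[20];
        struct myformat1 { uint16_t field1, field2, field3; } format1;
        struct myformat2 { double value1, value2; } format2;
    };
} Packet;

/* The nested tag can be used like any other struct tag. */
size_t myformat1_size(void) { return sizeof(struct myformat1); }

/* Hypothetical helper taking a pointer to the nested type by its tag. */
uint16_t sum_fields(const struct myformat1 *f)
{
    assert(f);
    return (uint16_t)(f->field1 + f->field2 + f->field3);
}
```

A caller would write sum_fields(&aPacketPointer->format1), so the dotted access path to the member is unchanged.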

Related

Use of macros in array definition in C

I am new to C and using it to program a Nordic nrf52 chip. I believe my problem is a general C one though rather than application.
I am setting up an array of structs using macros predefined in the chip SDK. Using those macros in the array initialisation works, but doing element by element does not.
So, the following works:
nrf_twi_mngr_transfer_t transfers_1[2] = { \
NRF_TWI_MNGR_WRITE(MSBARO5X_0_ADDR , &reg_addr[1], 1, NRF_TWI_MNGR_NO_STOP), \
NRF_TWI_MNGR_READ (MSBARO5X_0_ADDR , &p_buffer[0], sizeof(p_buffer), 0)
};
Where:
typedef struct {
uint8_t * p_data; ///< Pointer to the buffer holding the data.
uint8_t length; ///< Number of bytes to transfer.
uint8_t operation; ///< Device address combined with transfer direction.
uint8_t flags; ///< Transfer flags (see #ref NRF_TWI_MNGR_NO_STOP).
} nrf_twi_mngr_transfer_t;
NRF_TWI_MNGR_WRITE and NRF_TWI_MNGR_READ are macros that use further macros, for example:
#define NRF_TWI_MNGR_WRITE(address, p_data, length, flags) \
NRF_TWI_MNGR_TRANSFER(NRF_TWI_MNGR_WRITE_OP(address), p_data, length, flags)
which uses
#define NRF_TWI_MNGR_WRITE_OP(address) (((address) << 1) | 0)
and
#define NRF_TWI_MNGR_TRANSFER(_operation, _p_data, _length, _flags) \
{ \
.p_data = (uint8_t *)(_p_data), \
.length = _length, \
.operation = _operation, \
.flags = _flags \
}
What I want to do is change individual items in this array, for example:
transfers_1[0] = NRF_TWI_MNGR_WRITE(MSBARO5X_0_ADDR , &reg_addr[1], 1, NRF_TWI_MNGR_NO_STOP);
However when I do that, I get the error "expected an expression".
MSBARO5X_0_ADDR is also defined in a define statement:
#define MSBARO5X_0_ADDR 0x76
If I replace this in any of the above code with a variable, I get the same "expected an expression" error. I suspect the two problems I have are due to the same lack of understanding on my part. So forgive me for combining the two in a single post.
So the questions are:
-Why am I getting this error?
-Is it possible to change individual items in my array, and if so how?
-Is it possible to use a variable in place of MSBARO5X_0_ADDR, and if so how?
Many thanks!
Ultimately, the macro expands into a brace enclosed initializer. Such a thing is not an expression, so it cannot be used as the right hand side of plain assignment (assignment and initialization are different things). It will work as part of a larger initializer, but not the way you try to use it unmodified.
But all is not lost. The syntax of the initializer implies C99 support, so we can use a trick. Structure objects can be assigned to each other, so we need only obtain an object from somewhere. We can use a compound literal to create said object:
transfers_1[0] = (nrf_twi_mngr_transfer_t)NRF_TWI_MNGR_WRITE(/*Your arguments*/);
If you define the value of a structure the moment you declare it, the compiler will infer the type of the structure from the declaration. So this here will compile:
struct coordinates {
int x;
int y;
};
struct coordinates origin = { 10, 20 }; // This is OK
But if you assign a value to a previously declared variable, the compiler cannot infer its type. This code won't compile:
struct coordinates origin;
origin = { 10, 20 }; // ERROR! The type of the rvalue is unknown!
The type is unknown, because two structures are not equivalent in C just because they have the same members. E.g. this is legal in C:
struct coordinates {
int x;
int y;
};
struct dayOfYear {
int day;
int month;
};
Now what would { 5, 8 } be? The coordinates (5/8) or the 5th of August? It could be either. All that the compiler knows is that it is a struct of type { int, int }. Yet this does not define a type in C. The following is possible in some languages but it's not possible in C:
struct dayOfYear date = { 2, 3 };
struct coordinates cords = date; // ERROR!
Despite the fact that both structures are of type { int, int }, for the compiler struct dayOfYear and struct coordinates are two completely distinct and unrelated data types.
If you want to declare a hardcoded struct value, you need to tell the compiler what kind of struct that is:
struct coordinates origin;
origin = (struct coordinates){ 10, 20 }; // This is OK
Your NRF_TWI_MNGR_TRANSFER defines a hardcoded struct but only when you use that in a definition the compiler knows the type. If you try to use it as an assignment, you need to cast to the correct type.
transfers_1[0] = (nrf_twi_mngr_transfer_t)NRF_TWI_MNGR_WRITE(MSBARO5X_0_ADDR , &reg_addr[1], 1, NRF_TWI_MNGR_NO_STOP);
Which is not really a cast, even though it has the same syntax. In fact this is just telling the compiler how to interpret the following data.
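The compound-literal approach can be sketched end to end as follows. The type transfer_t and the macros WRITE_OP and MAKE_TRANSFER are hypothetical stand-ins shaped like the SDK definitions quoted in the question, not the real nRF ones. Inside a function the initializers of a compound literal do not have to be constants, so an ordinary variable can supply the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nrf_twi_mngr_transfer_t. */
typedef struct {
    uint8_t *p_data;
    uint8_t  length;
    uint8_t  operation;
    uint8_t  flags;
} transfer_t;

#define WRITE_OP(address) (((address) << 1) | 0)

/* Expands to a brace-enclosed initializer, like the SDK macro does. */
#define MAKE_TRANSFER(_operation, _p_data, _length, _flags) \
{                                                           \
    .p_data    = (uint8_t *)(_p_data),                      \
    .length    = (_length),                                 \
    .operation = (_operation),                              \
    .flags     = (_flags)                                   \
}

/* Replaces one array element at run time; address is an ordinary variable. */
void set_write(transfer_t *slot, uint8_t address, uint8_t *buf, uint8_t len)
{
    assert(slot);
    /* The (transfer_t) prefix turns the initializer into a compound literal. */
    *slot = (transfer_t)MAKE_TRANSFER(WRITE_OP(address), buf, len, 0);
}
```

Calling set_write(&transfers[0], addr, buffer, 1) then overwrites a single element while leaving the rest of the array untouched.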

How to generate warning if a structure is declared without "__attribute__((aligned(8)))"

I want the compiler to generate a warning for me if a structure is declared without __attribute__((aligned(8))).
For example, if a structure is declared like this:
struct my_float {
float number;
} __attribute__((aligned(8)));
It will not generate any warning. But if I declare another struct like this:
struct my_float {
float number;
};
the compiler will generate a warning for me.
My working environment is Linux/GCC.
I don't think you can check this automatically for ALL your structures, but you can still check each one manually with something like:
// x % 8 <=> x & (8-1) (avoids the modulo operator)
#define MODULO_8_MASK 0x7U
ASSERT_COMPILE((sizeof(struct my_float) & MODULO_8_MASK) == 0);
This should fail at compile time if the size of your structure is not a multiple of 8. ASSERT_COMPILE stands for whatever compile-time assertion macro your project provides; it is not a standard macro. Note that this tests the size, which is only a proxy for the alignment.
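With a C11 compiler the alignment itself can be tested directly. The sketch below assumes GCC or Clang (for the aligned attribute) and C11 (for static_assert and alignof); the helper my_float_alignment is hypothetical. It still has to be written once per structure, so it does not catch structures nobody remembered to list.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */

struct my_float {
    float number;
} __attribute__((aligned(8)));

/* Fails the build, rather than warning, if the attribute is ever dropped. */
static_assert(alignof(struct my_float) >= 8,
              "struct my_float must be 8-byte aligned");

/* Hypothetical helper exposing the alignment as a run-time value. */
unsigned my_float_alignment(void)
{
    return (unsigned)alignof(struct my_float);
}
```

Removing the attribute from the declaration makes the static assertion fire on targets where float has a smaller alignment than 8.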
From experience, it is not possible to do.
This attribute specifies a minimum alignment (in bytes) for variables of the specified type.
struct S { short f[3]; } __attribute__ ((aligned (8)));
typedef int more_aligned_int __attribute__ ((aligned (8)));
force the compiler to ensure (as far as it can) that each variable whose type is struct S or more_aligned_int is allocated and aligned at least on an 8-byte boundary.

When are anonymous structs and unions useful in C11?

C11 adds, among other things, 'Anonymous Structs and Unions'.
I poked around but could not find a clear explanation of when anonymous structs and unions would be useful. I ask because I don't completely understand what they are. I get that they are structs or unions without the name afterwards, but I have always (had to?) treat that as an error so I can only conceive a use for named structs.
Anonymous unions inside structures are very useful in practice. Suppose you want to implement a discriminated sum type (or tagged union): an aggregate with a boolean and either a float or a char* (i.e. a string), depending on the boolean flag. With C11 you should be able to code
typedef struct {
bool is_float;
union {
float f;
char* s;
};
} mychoice_t;
double as_float(mychoice_t* ch)
{
if (ch->is_float) return ch->f;
else return atof(ch->s);
}
With C99, you'll have to name the union, and code ch->u.f and ch->u.s which is less readable and more verbose.
Another way to implement some tagged union type is to use casts. The Ocaml runtime gives a lot of examples.
The SBCL implementation of Common Lisp does use some union to implement tagged union types. And GNU make also uses them.
A typical real-world use of anonymous structs and unions is to provide an alternative view of data. For example, when implementing a 3D point type:
typedef struct {
union{
struct{
double x;
double y;
double z;
};
double raw[3];
};
}vec3d_t;
vec3d_t v;
v.x = 4.0;
v.raw[1] = 3.0; // Equivalent to v.y = 3.0
v.z = 2.0;
This is useful if you interface to code that expects a 3D vector as a pointer to three doubles. Instead of doing f(&v.x) which is ugly, you can do f(v.raw) which makes your intent clear.
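A short sketch of that interfacing case is shown below. The functions sum3 and sum_components are hypothetical, and the static assertion documents the assumption that the compiler puts no padding between the three doubles, which holds on common ABIs but is not promised by the standard.

```c
#include <assert.h>  /* static_assert (C11) */

typedef struct {
    union {
        struct {
            double x;
            double y;
            double z;
        };
        double raw[3];
    };
} vec3d_t;

/* The two views only line up if the struct view has no padding. */
static_assert(sizeof(vec3d_t) == 3 * sizeof(double),
              "no padding expected in vec3d_t");

/* Hypothetical legacy-style function that only knows arrays of three doubles. */
double sum3(const double *p)
{
    return p[0] + p[1] + p[2];
}

/* Fills the vector by name, then hands the array view to sum3(). */
double sum_components(double x, double y, double z)
{
    vec3d_t v;
    v.x = x;
    v.y = y;
    v.z = z;
    return sum3(v.raw);
}
```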
struct bla {
struct { int a; int b; };
int c;
};
the type struct bla has a member of a C11 anonymous structure type.
struct { int a; int b; } has no tag and the object has no name: it is an anonymous structure type.
You can access the members of the anonymous structure this way:
struct bla myobject;
myobject.a = 1; // a is a member of the anonymous structure inside struct bla
myobject.b = 2; // same for b
myobject.c = 3; // c is a member of the structure struct bla
Another useful application is dealing with RGBA colors, since you might want to access each channel on its own or all of them as a single int.
typedef struct {
union{
struct {uint8_t a, b, g, r;};
uint32_t val;
};
}Color;
Now you can access the individual rgba values or the entire value. On a little-endian machine the highest byte of val is r, i.e.:
int main(void)
{
Color x;
x.r = 0x11;
x.g = 0xAA;
x.b = 0xCC;
x.a = 0xFF;
printf("%X\n", x.val);
return 0;
}
Prints 11AACCFF
I'm not sure why C11 allows anonymous structures inside structures. But Linux uses it with a certain language extension:
/**
* struct blk_mq_ctx - State for a software queue facing the submitting CPUs
*/
struct blk_mq_ctx {
struct {
spinlock_t lock;
struct list_head rq_lists[HCTX_MAX_TYPES];
} ____cacheline_aligned_in_smp;
/* ... other fields without explicit alignment annotations ... */
} ____cacheline_aligned_in_smp;
I'm not sure if that example is strictly necessary, except to make the intent clear.
EDIT: I found another similar pattern which is more clear-cut. The anonymous struct feature is used with this attribute:
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
/* This anon struct can add padding, so only enable it under randstruct. */
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end } __randomize_layout;
#endif
I.e. a language extension / compiler plugin to randomize field order (ASLR-style exploit "hardening"):
struct kiocb {
struct file *ki_filp;
/* The 'ki_filp' pointer is shared in a union for aio */
randomized_struct_fields_start
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
unsigned int ki_cookie; /* for ->iopoll */
randomized_struct_fields_end
};
Well, if you declare variables from that struct only once in your code, why does it need a name?
struct {
int a;
struct {
int b;
int c;
} d;
} e,f;
And you can now write things like e.a, f.d.b, etc.
(I added the inner struct because I think that this is one of the most common uses of anonymous structs.)

Compact access to variables in nested structures

Given this simple C code:
struct {
struct a {
int foo;
};
struct b {
char *bar;
};
} s;
I am wondering whether there is a way to access a variable in one of the nested structures in a more compact way than s.a.foo = 5, for instance.
First, notice that your example is not standard C89, although some compilers accept it when you enable language extensions (with GCC you need to pass the -fms-extensions flag). You are using unnamed fields. A more standard way of coding would be:
struct a {
int foo;
};
struct b {
char* bar;
};
struct {
struct a aa;
struct b bb;
} s;
Back to your question: no, there is no other way. However, preprocessor macros could help. For instance, assuming the above declarations, you could define
#define afoo aa.foo
#define bbar bb.bar
and then you can code s.afoo instead of s.aa.foo
You might also define macros like
#define AFOO(X) (X).aa.foo
and then code AFOO(s)
Using such preprocessor macros does have a drawback: with my example, you can no longer declare a variable (or formal argument, or field, or function) named afoo.
But I am not sure you should bother. My personal advice & habit is to give longer and often unique names to fields (and also to name struct a_st my struct-ures). Take advantage of the auto-completion abilities of your editor. Don't forget that your code is more often read than written, so use meaningful names in it.
There is not. You have to specify the path to the memory address you wish to reference.
You can't cast structs directly, but you can cast pointers to structs. So if you have this struct:
typedef struct {
struct {
int foo;
} a;
struct {
char bar;
} b;
} s;
You can create a struct like this:
typedef struct {
int foo;
char bar;
} sa;
Now when you create the struct, stash a pointer to it:
s myS;
myS.a.foo = 123;
myS.b.bar = 10;
sa *mySA = (sa *)&myS;
Then you can do this:
printf("I'm really a s.b.bar %d", (*mySA).bar);
Which will print out the appropriate value.
So now you can do:
(*mySA).bar = 22;
printf("%d", myS.b.bar);
You aren't really saving that much typing though.

Using unions to simplify casts

I realize that what I am trying to do isn't safe. But I am just doing some testing and image processing so my focus here is on speed.
Right now this code gives me the corresponding bytes for a 32-bit pixel value type.
struct Pixel {
unsigned char b,g,r,a;
};
I wanted to check if I have a pixel that is under a certain value (e.g. r, g, b <= 0x10). I figured I wanted to just conditional-test the bit-and of the bits of the pixel with 0x00E0E0E0 (I could have wrong endianness here) to get the dark pixels.
Rather than using this ugly mess (*((uint32_t*)&pixel)) to get the 32-bit unsigned int value, I figured there should be a way for me to set it up so I can just use pixel.i, while keeping the ability to reference the green byte using pixel.g.
Can I do this? This won't work:
struct Pixel {
unsigned char b,g,r,a;
};
union Pixel_u {
Pixel p;
uint32_t bits;
};
I would need to edit my existing code to say pixel.p.g to get the green color byte. Same happens if I do this:
union Pixel {
unsigned char c[4];
uint32_t bits;
};
This would work too but I still need to change everything to index into c, which is a bit ugly but I can make it work with a macro if I really needed to.
(Edited) Both gcc and MSVC allow 'anonymous' structs/unions, which might solve your problem. For example:
union Pixel {
struct {unsigned char b,g,r,a;};
uint32_t bits; // use 'unsigned' for MSVC
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
gives (on Intel):
04030201
This requires changing all your declarations of struct Pixel to union Pixel in your original code. But this defect can be fixed via:
struct Pixel {
union {
struct {unsigned char b,g,r,a;};
uint32_t bits;
};
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
This also works with VC9, with 'warning C4201: nonstandard extension used : nameless struct/union'. Microsoft uses this trick, for example, in:
typedef union {
struct {
DWORD LowPart;
LONG HighPart;
}; // <-- nameless member!
struct {
DWORD LowPart;
LONG HighPart;
} u;
LONGLONG QuadPart;
} LARGE_INTEGER;
but they 'cheat' by suppressing the unwanted warning.
While the above examples are ok, if you use this technique too often, you'll quickly end up with unmaintainable code. Five suggestions to make things clearer:
(1) Change the name bits to something uglier like union_bits, to clearly indicate something out-of-the-ordinary.
(2) Go back to the ugly cast the OP rejected, but hide its ugliness in a macro or in an inline function, as in:
#define BITS(x) (*(uint32_t*)&(x))
But this would break the strict aliasing rules. (See, for example, AndreyT's answer: C99 strict aliasing rules in C++ (GCC).)
(3) Keep the original definition of Pixel, but do a better cast:
struct Pixel {unsigned char b,g,r,a;} foo;
// ...
printf("%08x\n", ((union {struct Pixel dummy; uint32_t bits;})foo).bits);
(4) But that is even uglier. You can fix this by a typedef:
struct Pixel {unsigned char b,g,r,a;} foo;
typedef union {struct Pixel dummy; uint32_t bits;} CastPixelToBits;
// ...
printf("%08x\n", ((CastPixelToBits)foo).bits); // not VC9
With VC9, or with gcc using -pedantic, you'll need (don't use this with gcc--see note at end):
printf("%08x\n", ((CastPixelToBits*)&foo)->bits); // VC9 (not gcc)
(5) A macro may perhaps be preferred. In gcc, you can define a union cast to any given type very neatly:
#define CAST(type, x) (((union {typeof(x) src; type dst;})(x)).dst) // gcc
// ...
printf("%08x\n", CAST(uint32_t, foo));
With VC9 and other compilers, there is no typeof, and pointers may be needed (don't use this with gcc--see note at end):
#define CAST(typeof_x, type, x) (((union {typeof_x src; type dst;}*)&(x))->dst)
Self-documenting, and safer. And not too ugly. All these suggestions are likely to compile to identical code, so efficiency is not an issue. See also my related answer: How to format a function pointer?.
Warning about gcc: The GCC Manual version 4.3.4 (but not version 4.3.0) states that this last example, with &(x), is undefined behaviour. See http://davmac.wordpress.com/2010/01/08/gcc-strict-aliasing-c99/ and http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html.
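A different technique from the ones listed above, which sidesteps the aliasing question altogether, is to copy the bytes with memcpy. This is a swapped-in alternative rather than something the answer proposes. The function names are hypothetical, and the result still depends on the host's byte order, just as the union version does.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

struct Pixel { unsigned char b, g, r, a; };

/* The copy below is only meaningful if the two types have the same size. */
static_assert(sizeof(struct Pixel) == sizeof(uint32_t),
              "struct Pixel must be exactly 32 bits");

/* Copies the four bytes into a uint32_t; byte order follows the host. */
static inline uint32_t pixel_bits(const struct Pixel *p)
{
    uint32_t bits;
    memcpy(&bits, p, sizeof bits);
    return bits;
}

/* Inverse of pixel_bits(). */
static inline struct Pixel bits_to_pixel(uint32_t bits)
{
    struct Pixel p;
    memcpy(&p, &bits, sizeof p);
    return p;
}
```

Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, so the cost should be comparable to the cast.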
The problem with a structure inside a union, is that the compiler is allowed to add padding bytes between members of a structure (or class), except bit fields.
Given:
struct Pixel
{
unsigned char red;
unsigned char green;
unsigned char blue;
unsigned char alpha;
};
This could be laid out as:
Offset Field
------ -----
0x00 red
0x04 green
0x08 blue
0x0C alpha
So the size of the structure would be 16 bytes.
When put in a union, the compiler would take the larger capacity of the two to determine space. Also, as you can see, a 32 bit integer would not align correctly.
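Whether a particular compiler actually pads this struct can be checked at compile time instead of guessed. The following sketch assumes C11; mainstream ABIs lay out four unsigned char members contiguously, in which case both assertions pass. The helper pixel_size is hypothetical.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

struct Pixel {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
    unsigned char alpha;
};

/* Refuse to compile if the struct cannot overlay a 32-bit integer exactly. */
static_assert(sizeof(struct Pixel) == sizeof(uint32_t),
              "unexpected padding in struct Pixel");
static_assert(offsetof(struct Pixel, alpha) == 3,
              "members are not tightly packed");

/* Hypothetical helper exposing the size as a run-time value. */
size_t pixel_size(void) { return sizeof(struct Pixel); }
```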
I suggest creating functions to combine and extract pixels from a 32-bit quantity. You can declare it inline too:
void Int_To_Pixel(const unsigned int word,
Pixel& p)
{
p.red = (word & 0xff000000) >> 24;
p.green = (word & 0x00ff0000) >> 16;
p.blue = (word & 0x0000ff00) >> 8;
p.alpha = (word & 0x000000ff);
return;
}
This is a lot more reliable than a struct inside a union, including one with bit fields:
struct Pixel_Bit_Fields
{
unsigned int red : 8;
unsigned int green : 8;
unsigned int blue : 8;
unsigned int alpha : 8;
};
There is still some mystery when reading this whether red is the MSB or alpha is the MSB. By using bit manipulation, there is no question when reading the code.
Just my suggestions, YMMV.
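A plain C version of the shift-based approach, covering both directions, might look like the sketch below. The function names and the 0xRRGGBBAA channel order are assumptions chosen for the example; the point is that the order is spelled out in the code and does not depend on the host's endianness.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

static_assert(UCHAR_MAX == 255, "code assumes 8-bit channels");

struct Pixel {
    unsigned char red, green, blue, alpha;
};

/* Packs as 0xRRGGBBAA regardless of host endianness. */
uint32_t pixel_to_u32(struct Pixel p)
{
    return ((uint32_t)p.red   << 24) |
           ((uint32_t)p.green << 16) |
           ((uint32_t)p.blue  <<  8) |
            (uint32_t)p.alpha;
}

/* Unpacks a 0xRRGGBBAA word into its four channels. */
struct Pixel u32_to_pixel(uint32_t word)
{
    struct Pixel p;
    p.red   = (unsigned char)((word >> 24) & 0xFFu);
    p.green = (unsigned char)((word >> 16) & 0xFFu);
    p.blue  = (unsigned char)((word >>  8) & 0xFFu);
    p.alpha = (unsigned char)(word & 0xFFu);
    return p;
}
```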
Why not make the ugly mess into an inline routine? Something like:
inline uint32_t pixel32(const Pixel& p)
{
return *reinterpret_cast<const uint32_t*>(&p);
}
You could also provide this routine as a member function for Pixel, called i(), which would allow you to access the value via pixel.i() if you preferred to do it that way. (I'd lean on separating the functionality from the data structure when invariants need not be enforced.)

Resources