How can I access variables in packed structure without unaligned mode?

I'm using packed structures for communication over direct DMA access. Here is my test code:
// structure for communication buf 1
typedef __packed struct _test1
{
uint8_t a;
uint32_t b;
uint16_t c;
uint16_t d;
uint32_t e;
} test1;
// structure for communication buf 2
.
.
.
// structure for communication buf 3
.
.
.
// structure for communication buf set
typedef __packed struct _test2
{
uint8_t dump[3];
test1 t;
// may have many other packed structure for communication buf
} test2;
#pragma anon_unions
typedef struct _test3
{
union
{
uint32_t buf[4];
__packed struct
{
__packed uint8_t dump[3];
test1 t;
};
};
} test3;
test1 t1;
test2 t2;
test3 t3;
The sizes of these structures are:
sizeof(t1) = 13
sizeof(t2) = 16
sizeof(t3) = 16
If I want to access variable b without hurting performance, I need to read/write the memory with an aligned access, using an offset calculated by hand:
t3.buf[1]
But I cannot read/write the variables through the structure without unaligned accesses:
t2.t.b
t3.t.b
So I defined structures like the following, packing only variable a:
typedef struct _test4
{
__packed uint8_t a;
uint32_t b;
uint16_t c;
uint16_t d;
uint32_t e;
} test4;
typedef struct _test5
{
__packed uint8_t dump[3];
test4 t;
} test5;
test4 t4;
test5 t5;
Now every member access is aligned, but padding is inserted as well:
sizeof(t4) = 16
sizeof(t5) = 20
So how can I define packed structures and still access a single variable in them without unaligned access (except for a)?
Thanks a lot for helping.

Your question introduces two problems under the umbrella of one:
Communication between components and/or devices; this may or may not have the same underlying representation of structures and integers, hence your use of the non-portable __packed attribute.
Performance of access, biased by alignment and/or data size; on one hand the compiler aligns data to fall in line with the bus, yet on the other hand that data might occupy too much space in your cache.
One of these is the actual problem you want to solve, X, and the other the Y in your XY problem. Please avoid asking XY problems in the future.
Have you considered how to guarantee that uint16_t and uint32_t will be big endian or little endian, based on your requirements? You need to specify that, if you care about portability. I care about portability, so that's what my answer will focus on. You may also notice how optimal efficiency will be obtained, too. Nonetheless, to make this portable:
You should be serialising your data using serialisation functions that convert each member of your structure into a sequence of bytes by division and modulo (or right shift and binary AND) operations.
Similarly, you should be deserialising your data by the inverse operations, multiplication and addition (or left shift and binary OR).
As an example, here's some code showing both little endian and big endian for serialising and deserialising test1:
typedef /*__packed*/ struct test1
{
uint32_t b;
uint32_t e;
uint16_t c;
uint16_t d;
uint8_t a;
} test1;
/* Requires <stdint.h> and <string.h>. */
/* Decode the 13 bytes received from the wire into a test1. */
void deserialise_test1(test1 *destination, const void *source) {
    const uint8_t *s = source;
    destination->a = s[0];
    destination->b = s[1] * 0x01000000UL
                   + s[2] * 0x00010000UL
                   + s[3] * 0x00000100UL
                   + s[4];                 /* big endian */
    destination->c = s[5] * 0x0100U
                   + s[6];                 /* big endian */
    destination->d = s[7]
                   + s[8] * 0x0100U;       /* little endian */
    destination->e = s[9]
                   + s[10] * 0x00000100UL
                   + s[11] * 0x00010000UL
                   + s[12] * 0x01000000UL; /* little endian */
}
/* Encode a test1 into the same 13-byte wire format. */
void serialise_test1(void *destination, const test1 *source) {
    uint8_t temp[] = { source->a
                     , source->b >> 24, source->b >> 16
                     , source->b >> 8, source->b          /* big endian */
                     , source->c >> 8, source->c          /* big endian */
                     , source->d, source->d >> 8          /* little endian */
                     , source->e, source->e >> 8
                     , source->e >> 16, source->e >> 24 }; /* little endian */
    memcpy(destination, temp, sizeof temp);
}
You may notice that I removed the __packed attribute and rearranged the members so that the larger members come before the smaller ones; this is likely to reduce padding significantly. The functions convert between an array of uint8_t (which you send to or receive from the wire, DMA, or whatnot) and your test1 structure, so this code is much more portable. You benefit from the guarantees this code provides regarding the structure of your protocol, whereas before it was at the whim of the implementation, and two devices using two different implementations might have disagreed about, for example, the internal representation of integers.
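Serialising with shifts and masks, and deserialising with shifts and ORs, can also be wrapped in small per-width helpers. This is only a sketch and not part of the answer's code; the names put_be32, get_be32, put_le16 and get_le16 are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers: store/load fixed-width integers at any byte address,
   independent of host endianness and alignment. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);   /* most significant byte first */
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)v;           /* least significant byte first */
    p[1] = (uint8_t)(v >> 8);
}

static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}
```

With helpers like these, each field of the wire format becomes one call at a fixed byte offset, and every access is a plain byte access.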

You could hard code all the indexes like
typedef __packed struct _test1
{
uint8_t a;
uint32_t b;
uint16_t c;
uint16_t d;
uint32_t e;
} test1;
enum
{
a = 0,
b = 1,
c = 5,
d = 7,
e = 9,
};
test1 t1 = {1,2,3,4};//not sure if init lists work for packed values
printf("%u", *(uint32_t*)((uint8_t*)&t1 + b));
Or offsetof can be used like this
printf("%u", *(uint32_t*)((uint8_t*)&t1 + offsetof(test1, b)));
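Note that both casts above still dereference a uint32_t pointer at byte offset 1, which is itself an unaligned access. A possible alternative, sketched here under the assumption of the 13-byte layout from the question, is to copy the bytes with memcpy and let the compiler pick the load instructions. The names OFF_B, read_u32 and write_u32 are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Byte offsets of the fields in the 13-byte packed layout (a, b, c, d, e). */
enum { OFF_A = 0, OFF_B = 1, OFF_C = 5, OFF_D = 7, OFF_E = 9, WIRE_SIZE = 13 };

/* Read a 32-bit value (host byte order) from any byte offset. */
static uint32_t read_u32(const uint8_t *buf, size_t off)
{
    uint32_t v;
    memcpy(&v, buf + off, sizeof v);
    return v;
}

/* Write a 32-bit value (host byte order) at any byte offset. */
static void write_u32(uint8_t *buf, size_t off, uint32_t v)
{
    memcpy(buf + off, &v, sizeof v);
}
```

For example, write_u32(wire, OFF_B, value) followed by read_u32(wire, OFF_B) round-trips the value without ever forming a misaligned uint32_t pointer.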

Related

Pad a packed struct in C for 32-bit alignment

I have a struct defined that is used for messages sent across two different interfaces. One of them requires 32-bit alignment, but I need to minimize the space they take. Essentially I'm trying to byte-pack the structs, i.e. #pragma pack(1) but ensure that the resulting struct is a multiple of 32-bits long. I'm using a gcc arm cross-compiler for a 32-bit M3 processor. What I think I want to do is something like this:
#pragma pack(1)
typedef struct my_type_t
{
uint32_t someVal;
uint8_t anotherVal;
uint8_t reserved[<??>];
} my_type_t;
#pragma pack()
where <??> ensures that the size of my_type_t is divisible by 4 bytes, but without hard-coding the padding size. I can do something like this:
#pragma pack(1)
typedef struct wrapper_t
{
my_type_t m;
uint8_t reserved[(4 - sizeof(my_type_t) % 4) % 4]; // bytes needed to reach a multiple of 4
} wrapper_t;
#pragma pack()
but I'd like to avoid that.
Ultimately what I need to do is copy this to a buffer that is 32-bit addressable, like:
static my_type_t t; //If it makes a difference, this will be declared statically in my C source file
...
memcpy(bufferPtr, (uint32_t*)&t, sizeof(t)); //or however I should do this
I've looked at the __attribute__((aligned(N))) attribute, which gives me the 32-bit aligned memory address for the struct, but it does not byte-pack it. I am confused about how (or if) this can be combined with pack(1).
My question is this:
What is the right way to declare these structs so that I can minimize their footprint in memory but that allows me to copy/set it in 4-byte increments with a unsigned 32-bit pointer? (There are a bunch of these types of arbitrary size and content). If my approach above of combining pack and padding is going about this totally wrong, I'll happily take alternatives.
Edit:
Some constraints: I do not have control over one of the interfaces. It is expecting byte-packed frames. The other side is 32-bit addressable memory mapped registers. I have 64k of memory for the entire executable, and I'm limited on the libraries etc. I can bring in. There is already a great deal of space optimization I've had to do.
The struct in this question was just to explain my question. I have numerous messages of varying content that this applies to.

I can't speak for the specific compiler and architecture you are using, but I would expect the following to be sufficient:
typedef struct {
uint32_t x;
uint8_t y;
} my_type_t;
The structure normally has the same alignment as its largest field, and that includes adding the necessary padding at the end.
my_type_t
+---------------+
|       x       |
+---+-----------+
| y | [padding] |
+---+-----------+
|<-- 32 bits -->|
This is done so the fields are properly aligned when you have an array of them.
my_type_t my_array[2];
my_array[1].x = 123; // Needs to be properly aligned.
The above assumes you have control over the order of the fields to get the best space efficiency, because it relies on the compiler aligning the individual fields. But those assumptions can be removed using GCC attributes.
typedef struct {
uint8_t x;
uint32_t y;
uint8_t z;
}
__attribute__((packed)) // Remove interfield padding.
__attribute__((aligned(4))) // Set alignment and add tail padding.
my_type_t;
This produces this:
my_type_t
+---+-----------+
| x | y
+---+---+-------+
    | z | [pad] |
+---+---+-------+
|<-- 32 bits -->|
The packed attribute prevents padding from being added between fields, but aligning the structure to a 32-bit boundary forces the alignment you desire. This has the side effect of adding trailing padding so you can safely have an array of these structures.
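The resulting layout can be checked at compile time. A minimal sketch, assuming GCC or Clang (for the attributes) and a C11 compiler (for _Static_assert); the expected offsets follow from the field sizes in the example above:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  x;
    uint32_t y;
    uint8_t  z;
}
__attribute__((packed))      /* no padding between fields */
__attribute__((aligned(4)))  /* 4-byte alignment, tail padding added */
my_type_t;

/* Compilation fails if the compiler lays the struct out differently. */
_Static_assert(offsetof(my_type_t, y) == 1, "no padding after x");
_Static_assert(offsetof(my_type_t, z) == 5, "no padding after y");
_Static_assert(sizeof(my_type_t) == 8, "size rounded up to a multiple of 4");
_Static_assert(_Alignof(my_type_t) == 4, "struct is 4-byte aligned");
```

Because the size is a multiple of 4 and the object is 4-byte aligned, the whole struct can be copied in 32-bit words, even though y itself still sits at an unaligned offset.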

As you use gcc you need to use one of the attributes.
Example + demo.
#define PACKED __attribute__((packed))
#define ALIGN(n) __attribute__((aligned(n)))
typedef struct
{
uint8_t anotherVal;
uint32_t someVal;
}PACKED my_type_t;
my_type_t t = {1, 5};
ALIGN(64) my_type_t t1 = {1, 5};
ALIGN(512) my_type_t t2 = {2, 6};
int main()
{
printf("%p, %p, %p", (void *)&t, (void *)&t1, (void *)&t2);
}
Result:
0x404400, 0x404440, 0x404600
https://godbolt.org/z/j9YjqzEYW

I suggest combining #pragma pack with alignas:
#include <stdalign.h>
#include <stdint.h>
typedef struct {
#pragma pack(1)
alignas(4) struct { // requires 2+1+2 bytes but is aligned to even 4:s
uint16_t someVal; // +0
uint8_t anotherVal; // +2
uint16_t foo; // +3 (would be 4 without packing)
};
#pragma pack()
} my_type_t;
The anonymous inside struct makes access easy as before:
int main() {
my_type_t y;
y.someVal = 10;
y.anotherVal = 'a';
y.foo = 20;
printf("%td\n", (char*)&y.someVal - (char*)&y.someVal); // 0
printf("%td\n", (char*)&y.anotherVal - (char*)&y.someVal); // 2
printf("%td\n", (char*)&y.foo - (char*)&y.someVal); // 3
my_type_t x[2];
printf("%td\n", (char*)&x[1] - (char*)&x[0]); // 8 bytes diff
}
If you'd like to be able to take the sizeof the actual data carrying part of my_type_t (to send it), you could name the inner struct (which makes accessing the fields a little more cumbersome):
#pragma pack(1)
typedef struct {
uint16_t someVal;
uint8_t anotherVal;
uint16_t foo;
} inner;
#pragma pack()
typedef struct {
alignas(4) inner i;
} my_type_t;
You'd now have to mention i to access the fields, but it has the benefit that you can take sizeof and get 5 (in this example):
int main() {
my_type_t y;
printf("%zu %zu\n", sizeof y, alignof(y)); // 8 4
printf("%zu\n", sizeof y.i); // 5 (the actual data)
}

One way to make a structure type aligned is to put the alignment specifier on the first member of the struct. It can be combined with the packed attribute.
typedef struct {
_Alignas(4) uint8_t anotherVal;
uint32_t someVal;
} __attribute__((packed)) my_type_t;
Exemplary usage with alignment exaggerated to 64 bytes.
#include <stdint.h>
#include <stdio.h>
typedef struct {
_Alignas(64) uint8_t anotherVal;
uint32_t someVal;
} __attribute__((packed)) my_type_t;
int main() {
my_type_t a, b;
printf("%zu %p\n", sizeof a, (void*)&a);
printf("%zu %p\n", sizeof b, (void*)&b);
}
prints:
64 0x7ffff26caf80
64 0x7ffff26cafc0

Combining two union structs of ARM SoC

I'm trying to combine two typedef unions of a GPIO port of an ARM SoC into one, and address pointers into one. Currently, I have something which looks like this:
.h file:
//GPIO00 port
typedef union {
struct {
uint32_t GPIO000:1;
uint32_t GPIO001:1;
...
uint32_t GPIO0017:1;
};
struct {
uint32_t w:18;
};
} __GPIO00portbits_t;
volatile __GPIO00portbits_t * PTR_GPIO00portbits;
#define GPIO00portbits (*PTR_GPIO00portbits)
//GPIO01 port
typedef union {
struct {
uint32_t GPIO010:1;
uint32_t GPIO011:1;
...
uint32_t GPIO0117:1;
};
struct {
uint32_t w:18;
};
} __GPIO01portbits_t;
volatile __GPIO01portbits_t * PTR_GPIO01portbits;
#define GPIO01portbits (*PTR_GPIO01portbits)
.c file:
//GPIO 00 port
volatile __GPIO00portbits_t * PTR_GPIO00portbits = (__GPIO00portbits_t *) (AXIBRIDGE_BASE_ADDR + GPIO_00_BASE);
//GPIO 01 port
volatile __GPIO01portbits_t * PTR_GPIO01portbits = (__GPIO01portbits_t *) (AXIBRIDGE_BASE_ADDR + GPIO_01_BASE);
I can use this to control GPIO ports of the ARM SoC. I.e. I can control a single pin of GPIO00 by changing GPIO00portbits.GPIO00x. It works the same for GPIO01.
In reality, GPIO00 and GPIO01 are actually one port called GPIO0, where GPIO00 is pin 0-17 and GPIO01 is pin 18-35, so I would also like to combine GPIO00 and GPIO01 into one struct which can be controlled by changing GPIO0portbits.GPIO0x.
So I would like to have something like this:
typedef union {
struct {
uint64_t GPIO00:1 = GPIO00portbits.GPIO000;
uint64_t GPIO01:1 = GPIO00portbits.GPIO001;
...
uint64_t GPIO035:1 = GPIO01portbits.GPIO0117;
};
struct {
uint32_t w:36;
};
} __GPIO0portbits_t;
How can I do this?
Thank you in advance.

Data types generally
You have defined two distinct types, __GPIO00portbits_t and __GPIO01portbits_t, with identical structure and closely related use. This is pointless, and it may even get in your way. I would probably do this, instead:
typedef union {
struct {
uint32_t GPIO0:1;
uint32_t GPIO1:1;
...
uint32_t GPIO17:1;
};
uint32_t w:18;
} __GPIOhalfportbits_t;
extern volatile __GPIOhalfportbits_t *PTR_GPIO00portbits;
#define GPIO00portbits (*PTR_GPIO00portbits)
extern volatile __GPIOhalfportbits_t * PTR_GPIO01portbits;
#define GPIO01portbits (*PTR_GPIO01portbits)
Note, by the way, that you need the externs if the header is going to be used in more than one .c file, and that in that case exactly one of those .c files should contain the definitions you show.
Your specific request
I would also like to combine GPIO00 and GPIO01 into one struct which can be controlled by changing GPIO0portbits.GPIO0x
It seems like you may not be maintaining the appropriate mental distinction between objects and their data types. That would explain your odd duplication of data types, and also the way you describe what you're looking for. If you want to be able to have the option to treat the data as either a full 36 bits or two 18-bit halves, then you could imagine continuing the above with something like this:
// XXX: see below
typedef union {
struct {
__GPIOhalfportbits_t group0;
__GPIOhalfportbits_t group1;
};
struct {
uint32_t GPIO0:1;
uint32_t GPIO1:1;
...
uint32_t GPIO35:1;
};
uint64_t w:36; // uint32_t isn't wide enough
} __GPIOportbits_t;
In principle, then, you might access an object of that type either by directly accessing the bits ...
__GPIOportbits_t portbits;
// ...
if (portbits.GPIO23) {
// ...
}
... or via the half-port pieces ...
if (portbits.group1.GPIO5) {
// ...
}
Something like that might work under different circumstances, but in your case, this will not work. The problem is that the number of bits in your half-port pieces is not a multiple of the number of bits in a char (8 on your hardware). The size of char is the unit in which object sizes are measured, and, accordingly, the finest possible granularity for addresses.
That means that the size of my __GPIOhalfportbits_t and your __GPIO00portbits_t and __GPIO01portbits_t is at least 24 bits, not 18 bits. Therefore, if you lay two of them out one after the other then the bitfields cannot be laid out as a contiguous 36-bit range starting at the beginning of the object. There are at least 6 (padding) bits of the first object that need to go somewhere before the bits of the second half-port object.
For substantially the same reason, there are no pointer tricks that can accomplish what you're after, either. If you have a region of 36 contiguous bits then the second half does not start on an addressible boundary, so you cannot form a pointer to it.
On the other hand, if the two halves are not contiguous in the first place, then you might be able to go with something like this:
typedef struct {
__GPIOhalfportbits_t group0;
__GPIOhalfportbits_t group1;
} __GPIOportbits_t;
You would have to pay attention to alignment of the two half-port pieces, but there is probably an implementation-specific way to get that right. Given that the underlying data (we have now assumed) is not presented as a contiguous span of 36 bits in the first place, forming a union with a 36-bit bitfield does not make sense. It might nevertheless be possible to use a union to map individual single-bit bitfields on top of that pair of structures by inserting explicit padding of the appropriate size, but you need to consider whether any of this is actually worth doing. In particular, see below.
Important other considerations
Bitfields are a tricky business in general, and C makes very few guarantees about their behavior -- many fewer than a lot of people suppose or expect. It is a particularly poor idea to use bitfields to write to hardware ports, because you cannot write fewer than CHAR_BIT bits at once, and if you're writing via a bitfield whose size is not a power-of-two multiple of CHAR_BIT then you will be writing additional bits as well, whose values are unspecified.
I generally recommend avoiding bitfields altogether, except possibly for usage of bitfields in C-language programming interfaces provided by the relevant hardware manufacturer, in a manner consistent with those interfaces' documentation.
Alternatives
You could conceivably come up with some wrapper macros for accessing the GPIO port in terms of two half ports, and even in terms of individual bits within those ports. But this answer is already long, and such a macro-centric approach would be a whole other story.
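As a rough idea of what such wrappers might look like, here is a sketch using plain masks and shifts instead of bitfields. It assumes each half port is an 18-bit field in its own 32-bit register and that pin 0 is bit 0 of the first half; the function names and the choice to pass the register pointers as parameters are hypothetical.

```c
#include <stdint.h>

#define GPIO_HALF_BITS 18u
#define GPIO_HALF_MASK ((UINT32_C(1) << GPIO_HALF_BITS) - 1u)

/* Combine the two 18-bit halves into one 36-bit value (pin 0 = bit 0). */
static inline uint64_t gpio0_read(const volatile uint32_t *lo,
                                  const volatile uint32_t *hi)
{
    uint64_t low  = *lo & GPIO_HALF_MASK;
    uint64_t high = *hi & GPIO_HALF_MASK;
    return low | (high << GPIO_HALF_BITS);
}

/* Set or clear one pin (0..35) with a read-modify-write of its half. */
static inline void gpio0_write_pin(volatile uint32_t *lo, volatile uint32_t *hi,
                                   unsigned pin, int level)
{
    volatile uint32_t *reg = (pin < GPIO_HALF_BITS) ? lo : hi;
    uint32_t bit = UINT32_C(1) << (pin % GPIO_HALF_BITS);
    if (level)
        *reg |= bit;
    else
        *reg &= ~bit;
}
```

On real hardware the two pointers would be the register addresses from the question, for example (volatile uint32_t *)(AXIBRIDGE_BASE_ADDR + GPIO_00_BASE). Each call touches exactly one 32-bit register, so the access width is under your control rather than the compiler's.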

You can't do that, as they live at different addresses in memory.
Using objects to access hardware registers is very inefficient. At this level of programming, you need to optimize code as much as possible.
https://godbolt.org/z/ncbr8o
You can only "combine" them by keeping an additional object: read the data from the actual registers into it, and after making changes write it back to the registers.
#include <stdint.h>
#define AXIBRIDGE_BASE_ADDR 0x12340000
#define GPIO_00_BASE 0x400
#define GPIO_01_BASE 0x800
//GPIO00 port
typedef union {
struct {
uint32_t GPIO000:1;
uint32_t GPIO001:1;
uint32_t GPIO002:1;
uint32_t GPIO003:1;
uint32_t GPIO004:1;
uint32_t GPIO005:1;
uint32_t GPIO006:1;
uint32_t GPIO007:1;
uint32_t GPIO008:1;
uint32_t GPIO009:1;
uint32_t GPIO010:1;
uint32_t GPIO011:1;
uint32_t GPIO012:1;
uint32_t GPIO013:1;
uint32_t GPIO014:1;
uint32_t GPIO015:1;
uint32_t GPIO016:1;
uint32_t GPIO017:1;
};
struct {
uint32_t w:18;
};
} __GPIO00portbits_t;
typedef union {
struct {
uint32_t GPIO000:1;
uint32_t GPIO001:1;
uint32_t GPIO002:1;
uint32_t GPIO003:1;
uint32_t GPIO004:1;
uint32_t GPIO005:1;
uint32_t GPIO006:1;
uint32_t GPIO007:1;
uint32_t GPIO008:1;
uint32_t GPIO009:1;
uint32_t GPIO010:1;
uint32_t GPIO011:1;
uint32_t GPIO012:1;
uint32_t GPIO013:1;
uint32_t GPIO014:1;
uint32_t GPIO015:1;
uint32_t GPIO016:1;
uint32_t GPIO017:1;
uint32_t GPIO100:1;
uint32_t GPIO101:1;
uint32_t GPIO102:1;
uint32_t GPIO103:1;
uint32_t GPIO104:1;
uint32_t GPIO105:1;
uint32_t GPIO106:1;
uint32_t GPIO107:1;
uint32_t GPIO108:1;
uint32_t GPIO109:1;
uint32_t GPIO110:1;
uint32_t GPIO111:1;
uint32_t GPIO112:1;
uint32_t GPIO113:1;
uint32_t GPIO114:1;
uint32_t GPIO115:1;
uint32_t GPIO116:1;
uint32_t GPIO117:1;
};
struct {
uint64_t GPIO1w:18;
uint64_t GPIO2w:18;
};
} __GPIO12portbits_t;
#define GPIO1 ((volatile __GPIO00portbits_t *)(AXIBRIDGE_BASE_ADDR + GPIO_00_BASE))
#define GPIO2 ((volatile __GPIO00portbits_t *)(AXIBRIDGE_BASE_ADDR + GPIO_01_BASE))
#define COMBINE() (&(__GPIO12portbits_t){.GPIO1w = GPIO1 -> w, .GPIO2w = GPIO2 -> w})
#define UPDATEGPIO(ptr) do{GPIO1 -> w = ptr -> GPIO1w; GPIO2 -> w = ptr -> GPIO2w;}while(0)
void foo()
{
__GPIO12portbits_t *ptr = COMBINE();
ptr -> GPIO014 = 1;
ptr -> GPIO110 = 1;
UPDATEGPIO(ptr);
}
void bar()
{
GPIO1 -> GPIO014 = 1;
GPIO2 -> GPIO010 = 1;
}
But this is very inefficient: https://godbolt.org/z/jMsc7j

Writing structs to a socket

So I have some structs containing data that I want to send to another process using a unix socket. This process may not be compiled using the same compiler version, or even be written in C for that matter. This is the struct (note that some stuff is commented out):
struct nested_struct {
uint8_t a;
uint8_t b;
uint16_t c;
} /*__attribute__((packed))*/;
struct my_struct {
uint32_t num_nested_structs;
/* uint8_t padding[3];*/
uint8_t x;
uint16_t y;
uint16_t z;
struct nested_struct nested[];
} /*__attribute__((packed))*/;
For convenience and performance, I'd like to get away with something like
write(socket, &data.x, data.num_nested_structs * sizeof(struct nested_struct) + 5)
or something -- but I doubt this would be safe, given that struct my_struct is not nicely aligned. But how about if we un-comment the packed attribute? This feels like it should work, but I've read that referencing fields in __packed__ structs by address can be dangerous.
What if we instead uncomment the uint8_t padding[3]; field? Now both structs are word size-aligned (on a system with WORD_BIT = 32). Is it safe to assume that the compiler won't add any padding in this case? If so, is this enough to ensure that accessing 5 + 4*num_nested_structs bytes of memory starting from &my_struct.x is safe?
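Whether the compiler adds padding in the padding[3] variant can at least be checked at compile time. The sketch below assumes a C11 compiler and uses offsetof with _Static_assert; the expected offsets are what a typical ABI with 2-byte aligned uint16_t and 4-byte aligned uint32_t would produce. The C standard itself does not rule out padding here, so this verifies the assumption only for the compiler actually in use.

```c
#include <stddef.h>
#include <stdint.h>

struct nested_struct {
    uint8_t  a;
    uint8_t  b;
    uint16_t c;
};

struct my_struct {
    uint32_t num_nested_structs;
    uint8_t  padding[3];
    uint8_t  x;
    uint16_t y;
    uint16_t z;
    struct nested_struct nested[];
};

/* Compilation fails if any padding was inserted where none is expected. */
_Static_assert(sizeof(struct nested_struct) == 4, "nested_struct has padding");
_Static_assert(offsetof(struct my_struct, x) == 7, "unexpected offset of x");
_Static_assert(offsetof(struct my_struct, y) == 8, "padding between x and y");
_Static_assert(offsetof(struct my_struct, z) == 10, "padding between y and z");
_Static_assert(offsetof(struct my_struct, nested) == 12, "padding before nested");
```

If these assertions hold, the 5 + 4 * num_nested_structs bytes starting at x are contiguous in memory. Byte order is a separate concern that these checks do not address.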

Struct data alignment. Size must be an integer multiple of the largest type present?

I am making a GUI that send/receives data over a serial port. The data consists of messages that are defined in a struct like this:
typedef struct
{
uint8_t a;
uint8_t b;
uint8_t c;
uint16_t d[3];
uint16_t e;
} MyMsg_t;
I am also using a union, because it makes it easier for me to be able to set the data fields and be able to send it byte for byte. The union looks like this:
typedef union
{
MyMsg_t msg;
uint8_t array[MyMsgLength];
} MyMsg;
I now try to add some data to a message like this:
MyMsg msg;
msg.msg.a = (uint8_t) 1;
msg.msg.b = (uint8_t) 2;
msg.msg.c = (uint8_t) 3;
msg.msg.d[0] = (uint16_t) 4;
msg.msg.d[1] = (uint16_t) 5;
msg.msg.d[2] = (uint16_t) 6;
msg.msg.e = (uint16_t) 7;
And I transmit it over a serial bus byte-wise and the receiving end is:
1 2 3 19 4 0 5 0 6 0 7
(the data in c through e is reversed because of the bus)
This looks like the struct actually was:
typedef struct
{
uint8_t a; //1
uint8_t b; //2
uint8_t c; //3
//uint8_t x //19
uint16_t d[3];
uint16_t e;
} MyMsg_t;
From this I can assume that it somewhere in the C standard says that the struct must be minimum n * sizeof(uint16_t) in this case as we can not have for example 3.5 of the largest type in a struct, but it has to be an integer?
I guess this is what is called padding? Is there some way to force a struct to be n * sizeof(uint8_t) even when a larger type is present?
I know how I can avoid this, but it requires not using the union and more code. Is there some way to elegantly avoid this issue with minimal code intervention?

Don't forget about the endianness of the machine too, which is why doing this is not a recommended practice. If you don't care about endianness or are dealing with it some other way, then this approach is acceptable.
You can change the alignment to eliminate padding. How this is done depends on the compiler, however:
Microsoft Visual C++ - #pragma pack
gcc - __attribute__((packed)), also supports #pragma pack
Other compilers may have other options or means of accomplishing the same thing.
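If you would rather not depend on compiler-specific packing or on the machine's endianness, the message can be written into the byte array field by field. This is a sketch of that alternative, not the union approach from the question; it assumes 16-bit fields go on the wire low byte first, and the names mymsg_pack and MYMSG_WIRE_SIZE are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  a, b, c;
    uint16_t d[3];
    uint16_t e;
} MyMsg_t;

enum { MYMSG_WIRE_SIZE = 3 + 3 * 2 + 2 };  /* 11 bytes, no padding */

/* Write msg into out[] with every 16-bit field low byte first.
   Returns the number of bytes written. */
static size_t mymsg_pack(const MyMsg_t *msg, uint8_t out[MYMSG_WIRE_SIZE])
{
    size_t n = 0;
    out[n++] = msg->a;
    out[n++] = msg->b;
    out[n++] = msg->c;
    for (size_t i = 0; i < 3; i++) {
        out[n++] = (uint8_t)(msg->d[i] & 0xFF);
        out[n++] = (uint8_t)(msg->d[i] >> 8);
    }
    out[n++] = (uint8_t)(msg->e & 0xFF);
    out[n++] = (uint8_t)(msg->e >> 8);
    return n;
}
```

With the values from the question (a to c set to 1, 2, 3, d to 4, 5, 6 and e to 7) this yields the 11 bytes 1 2 3 4 0 5 0 6 0 7 0, with no padding byte after c.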

Using unions to simplify casts

I realize that what I am trying to do isn't safe. But I am just doing some testing and image processing so my focus here is on speed.
Right now this code gives me the corresponding bytes for a 32-bit pixel value type.
struct Pixel {
unsigned char b,g,r,a;
};
I wanted to check if I have a pixel that is under a certain value (e.g. r, g, b <= 0x10). I figured I wanted to just conditional-test the bit-and of the bits of the pixel with 0x00E0E0E0 (I could have wrong endianness here) to get the dark pixels.
Rather than using this ugly mess (*((uint32_t*)&pixel)) to get the 32-bit unsigned int value, I figured there should be a way for me to set it up so I can just use pixel.i, while keeping the ability to reference the green byte using pixel.g.
Can I do this? This won't work:
struct Pixel {
unsigned char b,g,r,a;
};
union Pixel_u {
Pixel p;
uint32_t bits;
};
I would need to edit my existing code to say pixel.p.g to get the green color byte. Same happens if I do this:
union Pixel {
unsigned char c[4];
uint32_t bits;
};
This would work too but I still need to change everything to index into c, which is a bit ugly but I can make it work with a macro if i really needed to.

(Edited) Both gcc and MSVC allow 'anonymous' structs/unions, which might solve your problem. For example:
union Pixel {
struct {unsigned char b,g,r,a;};
uint32_t bits; // use 'unsigned' for MSVC
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
gives (on Intel):
04030201
This requires changing all your declarations of struct Pixel to union Pixel in your original code. But this defect can be fixed via:
struct Pixel {
union {
struct {unsigned char b,g,r,a;};
uint32_t bits;
};
} foo;
foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);
This also works with VC9, with 'warning C4201: nonstandard extension used : nameless struct/union'. Microsoft uses this trick, for example, in:
typedef union {
struct {
DWORD LowPart;
LONG HighPart;
}; // <-- nameless member!
struct {
DWORD LowPart;
LONG HighPart;
} u;
LONGLONG QuadPart;
} LARGE_INTEGER;
but they 'cheat' by suppressing the unwanted warning.
While the above examples are ok, if you use this technique too often, you'll quickly end up with unmaintainable code. Five suggestions to make things clearer:
(1) Change the name bits to something uglier like union_bits, to clearly indicate something out-of-the-ordinary.
(2) Go back to the ugly cast the OP rejected, but hide its ugliness in a macro or in an inline function, as in:
#define BITS(x) (*(uint32_t*)&(x))
But this would break the strict aliasing rules. (See, for example, AndreyT's answer: C99 strict aliasing rules in C++ (GCC).)
(3) Keep the original definition of Pixel, but do a better cast:
struct Pixel {unsigned char b,g,r,a;} foo;
// ...
printf("%08x\n", ((union {struct Pixel dummy; uint32_t bits;})foo).bits);
(4) But that is even uglier. You can fix this by a typedef:
struct Pixel {unsigned char b,g,r,a;} foo;
typedef union {struct Pixel dummy; uint32_t bits;} CastPixelToBits;
// ...
printf("%08x\n", ((CastPixelToBits)foo).bits); // not VC9
With VC9, or with gcc using -pedantic, you'll need (don't use this with gcc--see note at end):
printf("%08x\n", ((CastPixelToBits*)&foo)->bits); // VC9 (not gcc)
(5) A macro may perhaps be preferred. In gcc, you can define a union cast to any given type very neatly:
#define CAST(type, x) (((union {typeof(x) src; type dst;})(x)).dst) // gcc
// ...
printf("%08x\n", CAST(uint32_t, foo));
With VC9 and other compilers, there is no typeof, and pointers may be needed (don't use this with gcc--see note at end):
#define CAST(typeof_x, type, x) (((union {typeof_x src; type dst;}*)&(x))->dst)
Self-documenting, and safer. And not too ugly. All these suggestions are likely to compile to identical code, so efficiency is not an issue. See also my related answer: How to format a function pointer?.
Warning about gcc: The GCC Manual version 4.3.4 (but not version 4.3.0) states that this last example, with &(x), is undefined behaviour. See http://davmac.wordpress.com/2010/01/08/gcc-strict-aliasing-c99/ and http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html.
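A variant that avoids both the pointer cast and the union is to copy the four bytes with memcpy, which compilers commonly reduce to a single load. This is a sketch rather than one of the five suggestions above; pixel_bits and pixel_is_dark are hypothetical names, and the darkness test (all of b, g, r below 0x20) is one reading of the mask from the question.

```c
#include <stdint.h>
#include <string.h>

struct Pixel { unsigned char b, g, r, a; };

_Static_assert(sizeof(struct Pixel) == sizeof(uint32_t), "Pixel must be 4 bytes");

/* Copy the four bytes into a uint32_t; no pointer cast, no union. */
static inline uint32_t pixel_bits(const struct Pixel *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Nonzero when b, g and r are all below 0x20 (top three bits clear).
   Building the mask from a Pixel keeps the test endian-independent. */
static inline int pixel_is_dark(const struct Pixel *p)
{
    const struct Pixel mask = { 0xE0, 0xE0, 0xE0, 0x00 };
    return (pixel_bits(p) & pixel_bits(&mask)) == 0;
}
```

Existing code can keep using pixel.g and friends unchanged, and only the places that need the combined value call pixel_bits(&pixel).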

The problem with a structure inside a union, is that the compiler is allowed to add padding bytes between members of a structure (or class), except bit fields.
Given:
struct Pixel
{
unsigned char red;
unsigned char green;
unsigned char blue;
unsigned char alpha;
};
This could be laid out as:
Offset Field
------ -----
0x00 red
0x04 green
0x08 blue
0x0C alpha
So the size of the structure would be 16 bytes.
When put in a union, the compiler would take the larger capacity of the two to determine space. Also, as you can see, a 32 bit integer would not align correctly.
I suggest creating functions to combine and extract pixels from a 32-bit quantity. You can declare it inline too:
void Int_To_Pixel(const unsigned int word,
Pixel& p)
{
p.red = (word & 0xff000000) >> 24;
p.green = (word & 0x00ff0000) >> 16;
p.blue = (word & 0x0000ff00) >> 8;
p.alpha = (word & 0x000000ff);
return;
}
This is a lot more reliable than a struct inside a union, including one with bit fields:
struct Pixel_Bit_Fields
{
unsigned int red : 8;
unsigned int green : 8;
unsigned int blue : 8;
unsigned int alpha : 8;
};
There is still some mystery when reading this whether red is the MSB or alpha is the MSB. By using bit manipulation, there is no question when reading the code.
Just my suggestions, YMMV.

Why not make the ugly mess into an inline routine? Something like:
inline uint32_t pixel32(const Pixel& p)
{
return *reinterpret_cast<const uint32_t*>(&p); // const is required: p is a const reference
}
You could also provide this routine as a member function for Pixel, called i(), which would allow you to access the value via pixel.i() if you preferred to do it that way. (I'd lean on separating the functionality from the data structure when invariants need not be enforced.)
