How does the compiler handle the misalignment?

The SO question Does GCC's __attribute__((__packed__))…? mentions that __attribute__((__packed__)) does "packing which introduces alignment issues when accessing the fields of a packed structure. The compiler will account for that when the fields are accessed directly, but not when they are accessed via pointers".
How does the compiler make sure that fields accessed directly are handled correctly? I suppose it internally adds some padding or does some pointer magic. In the case below, how does the compiler make sure that y is accessed correctly, as opposed to the access through the pointer?
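#include <arpa/inet.h> /* ntohl (POSIX) */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
struct packet {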
uint8_t x;
uint32_t y;
} __attribute__((packed));
int main ()
{
uint8_t bytes[5] = {1, 0, 0, 0, 2};
struct packet *p = (struct packet *)bytes;
// compiler handles misalignment because it knows that
// "struct packet" is packed
printf("y=%"PRIX32", ", ntohl(p->y));
// compiler does not handle misalignment - py does not inherit
// the packed attribute
uint32_t *py = &p->y;
printf("*py=%"PRIX32"\n", ntohl(*py));
return 0;
}

When the compiler sees the notation p->y, it knows you're accessing a structure member, and that the structure is packed, because of the declaration of p. On targets that cannot load a misaligned word directly, it translates this into code that reads byte by byte and performs the necessary bit shifting to combine the bytes into a uint32_t value. Essentially, it treats the expression p->y as if it were something like the following (assuming a little-endian machine, with y at offset 1 in the packed structure):
((uint32_t)*((unsigned char *)p + 4) << 24) | ((uint32_t)*((unsigned char *)p + 3) << 16) | ((uint32_t)*((unsigned char *)p + 2) << 8) | *((unsigned char *)p + 1)
But when you indirect through *py, the compiler doesn't know where the value of that variable came from. It doesn't know that it points into a packed structure, so it doesn't know that it would need to perform this shifting. py is declared to point to uint32_t, which can normally be accessed using an instruction that reads an entire 32-bit word at once. But this instruction expects the pointer to be aligned to a 4-byte boundary, so on a machine that enforces alignment you'll get a bus error due to the misalignment.

Related

How safe is casting a struct to uint8_t * or char * and accessing it via bytestream?

The following logic works fine but I'm uncertain of the caveats with what the standard says and whether it's totally safe to cast a struct to uint8_t * or char * to send to a message queue (which itself takes in a pointer to the buffer as well) or even a function?
My understanding is that, as long as uint8_t is considered a byte type (which char is), it can be used to address any set of bytes.
typedef struct
{
uint8_t a;
uint8_t b;
uint16_t c;
} __attribute__((packed)) Pkt;
int main()
{
Pkt pkt = {.a = 4, .b = 12, .c = 300};
mq_send(mq, (char *) &pkt, sizeof(pkt), 0);
}
Perhaps it's similar to passing a cast pointer to a function (on the receiver end) that parses the data byte by byte:
typedef struct
{
uint8_t a;
uint8_t b;
uint16_t c;
} __attribute__((packed)) Pkt;
void foo(uint8_t *ptr)
{
uint8_t a = *ptr++;
uint8_t b = *ptr++;
uint16_t str = (*(ptr+1) << 8) | *ptr;
printf ("A: %d, B: %d, C: %d\n", a, b, str);
}
int main()
{
Pkt pkt = {.a = 4, .b = 12, .c = 300};
foo((uint8_t *) &pkt);
}
C deliberately allows accessing the bytes of an object and supports communicating objects by transmitting the bytes that represent them and reconstructing them from the transmitted bytes. However, it should be done correctly, and there are some issues to deal with.
A character type should be used.
The preferred type to work with is unsigned char. This is preferred for two reasons:
The C standard defines the behavior of using character types to access the representations of objects. The character types are char, signed char, and unsigned char. The standard does not require that uint8_t be a character type. Although it may have the same size and general properties of unsigned char, it may be an extended integer type rather than an alias of unsigned char (or of char). In this case, the C standard does not define the behavior of accessing the bytes of an object with uint8_t.
unsigned char is preferred over char or signed char to avoid problems with signed integers in various C operations.
The sender and the receiver must agree on the representations of the objects or the protocol used for sending them.
If the sender and the receiver are compiled with the same C implementation using the same definitions for the objects being transmitted (such as the same structure definitions), they will agree on the representations. Between diverse C implementations, though, it is necessary to ensure there is clear agreement on how the transmitted bytes represent objects. As shown in your code, the structure is packed, which should take care of the problem that there may be padding inside structures. Other considerations include:
The order of bytes within integers. Little-endian-first (bytes in order from least significant to most) and big-endian-first (the reverse) are common, although others are possible. Big endian is most common in network protocols.
Representations of non-integers, such as floating-point formats. The IEEE-754 floating-point standard specifies some interchange formats, which are very widely used.
Structures are laid out identically, including the types of members.
Theoretically, the order of bits within bytes must be agreed, but this is not an issue if the network service is operating at the byte level.
Note that, of course, some objects are inherently impossible to send via byte representations because they need context in the running program, such as pointers and file handles.
Additional note
Another hazard to guard against is interpreting a byte buffer as another object. The C standard defines the behavior of accessing the bytes of an object (for example, something defined as a structure) using a character type, but it does not define the reverse. Sometimes naïve programmers will create an array of character type, read a network message into it, and then convert a pointer into the array to a pointer to a structure type. This runs afoul of two issues:
The conversion is not defined if the alignment is not correct. (This should not be a problem with a packed structure, which we would expect to have an alignment requirement of one byte.)
Accessing an array of characters as a different, incompatible type is not defined by the C standard.
The proper way to reassemble received bytes into an object is to copy them either into memory declared as the desired type or memory allocated (as with malloc) for the purpose of interpreting it as the intended object. This can be done by copying bytes from a buffer into the target memory or by directly passing the target memory to the network read routine, for it to fill in the bytes directly.

Unaligned memory access with array

In a C program, I have an array FooBuffer[] that is meant to work as a buffer for storing the contents of a data member of a struct like this:
struct Foo {
uint64_t data;
};
I was told that this line might cause unaligned access:
uint8_t FooBuffer[10] = {0U};
I have some knowledge that unaligned access depends on the alignment offset of the processor and in general, it consumes more read/write cycles. Under what circumstances would this cause unaligned memory access and how could I prevent it?
Edit:
A variable of type struct Foo would be stored in the buffer. Particularly, its member data would be split up into eight bytes that would be stored in the array FooBuffer. See attached code with some options for this.
#include <stdio.h>
#include <string.h>
typedef unsigned long long uint64; /* unsigned long may be only 32 bits wide */
typedef unsigned char uint8;
struct Foo
{
uint64 data;
};
int main()
{
struct Foo foo1 = {0x0123456789001122};
uint8 FooBuffer[10] = {0U};
FooBuffer[0] = (uint8)(foo1.data);
FooBuffer[1] = (uint8)(foo1.data >> 8);
FooBuffer[2] = (uint8)(foo1.data >> 16);
FooBuffer[3] = (uint8)(foo1.data >> 24);
FooBuffer[4] = (uint8)(foo1.data >> 32);
FooBuffer[5] = (uint8)(foo1.data >> 40);
FooBuffer[6] = (uint8)(foo1.data >> 48);
FooBuffer[7] = (uint8)(foo1.data >> 56);
struct Foo foo2 = {0x9876543210112233};
uint8 FooBuffer2[10] = {0U};
memcpy(FooBuffer2, &foo2, sizeof(foo2));
return 0;
}
However, it is not clear how this process is done, since a piece of proprietary software performs the operation. What would be the scenarios that could result in unaligned memory access after the "conversion"?
Defining either a structure such as struct Foo { uint64_t data; } or an array such as uint8_t FooBuffer[10]; and using them in normal ways will not cause an unaligned access. (Why did you use 10 for FooBuffer? Only 8 bytes are needed for this example.)
A method that novices sometimes attempt that can cause unaligned accesses is attempting to reinterpret an array of bytes as a data structure. For example, consider:
// Get raw bytes from network or somewhere.
uint8_t FooBuffer[10];
CallRoutineToReadBytes(FooBuffer,...);
// Reinterpret bytes as original type.
struct Foo x = * (struct Foo *) FooBuffer; // Never do this!
The problem here is that struct Foo has some alignment requirement, but FooBuffer does not. So FooBuffer could be at any address, but the cast to struct Foo * attempts to force it to an address for a struct Foo. If the alignment is not correct, the behavior is not defined by the C standard. Even if the system allows it and the program “works,” it may be accessing a struct Foo at an improperly aligned address and suffering performance problems.
To avoid this, a proper way to reinterpret bytes is to copy them into a new object:
struct Foo x;
memcpy(&x, FooBuffer, sizeof x);
Often a compiler will recognize what is happening here and, especially if struct Foo is not large, implement the memcpy in an efficient way, perhaps as two load-four-byte instructions or one load-eight-byte instruction.
Something you can do to help that along is ask the compiler to align FooBuffer by declaring it with the _Alignas keyword:
uint8_t _Alignas(struct Foo) FooBuffer[10];
Note that that might not help if you need to take bytes from the middle of a buffer, such as from a network message that includes preceding protocol bytes and other data. And, even if it does give the desired alignment, never use the * (struct Foo *) FooBuffer shown above. It has more problems than just alignment, one of which is that the C standard does not guarantee the behavior of reinterpreting data like this. (A supported way to do it in C is through unions, but memcpy is a fine solution.)
In the code you show, bytes are copied from foo1.data to FooBuffer using bit shifts. This also will not cause alignment problems; expressions that manipulate data like this work just fine. But there are two issues with it. One is that it nominally manipulates individual bytes one by one. That is perfectly legal in C, but it can be slow. A compiler might optimize it, and there might be built-ins or library functions to assist with it, depending on your platform.
The other issue is that it puts the bytes in an order according to their position values: The low-position-value bytes are put into the buffer first. In contrast, the memcpy method copies the bytes in the order they are stored in memory. Which method you want to use depends on the problem you are trying to solve. To store data on one system and read it back later on the same system, the memcpy method is fine. To send data between two systems using the same byte ordering, the memcpy method is fine. However, if you want to send data from one system on the Internet to another, and the two systems do not use the same byte order in memory, you need to agree on an order to use in the network packets. In this case, it is common to use the arrange-bytes-by-position-value method. Again, your platform may have built-ins or library routines to assist with this. For example, the htonl and ntohl routines are BSD routines that take a normal 32-bit unsigned integer and return it with its bytes arranged for network order or vice-versa.

The compiler takes the padding bytes of a structure into consideration while reading it

My code has a structure type-defined as follows:
typedef struct
{
Structure_2 a[4];
UCHAR b;
UCHAR c;
}Structure_1;
where the definition of Structure_2 is as follows:
typedef struct
{
ULONG x;
USHORT y;
UCHAR z;
}Structure_2;
There are also two functions in the code. The first one (named setter) declares a structure of type “Structure_1” and fills it with the data:
void setter (void)
{
Structure_1 data_to_send ;
data_to_send.a[0].x = 0x12345678;
data_to_send.a[0].y = 0x1234;
data_to_send.a[0].z = 0x12;
data_to_send.a[1].x = 0x12345678;
data_to_send.a[1].y = 0x1234;
data_to_send.a[1].z = 0x12;
data_to_send.a[2].x = 0x12345678;
data_to_send.a[2].y = 0x1234;
data_to_send.a[2].z = 0x12;
data_to_send.a[3].x = 0x12345678;
data_to_send.a[3].y = 0xAABB;
data_to_send.a[3].z = 0x12;
data_to_send.b =0;
data_to_send.c = 0;
getter(&data_to_send);
}
The compiler saves data_to_send in memory like that:
The second one named getter:
void getter (Structure_1 * ptr_to_data)
{
UCHAR R_1 = ptr_to_data -> b;
UCHAR R_2 = ptr_to_data -> c;
/* The remaining bytes are received */
}
I expect that R_1 will have the value “00”, and R_2 will have the value “00”.
But what happens is that the compiler translates the following two lines like this:
/* Get the data at the address ptr_to_data -> b,
which equals the start address of structure + 28 which contains the
value “AA”, and hence R_1 will have “AA” */
UCHAR R_1 = ptr_to_data -> b;
/* Get the data at the address ptr_to_data -> c,
which equals the start address of structure + 29 which contains the
value “BB”, and hence R_2 will have “BB” */
UCHAR R_2 = ptr_to_data -> c;
The compiler adds padding bytes while saving the structure on the stack. However, when it starts reading it, it forgets what it did (and includes the padding bytes in the read).
How could I inform the compiler that you should skip the padding byte while reading the elements of structure ?
I don't want a work around to solve this problem, I am curious to know why the compiler behaves like that ?
My compiler is Green Hills and my target is 32-bit.
How could I inform the compiler that you should skip the padding byte while reading the elements of structure ?
Short answer: You cannot.
The compiler will not disregard contents contained in your struct. However, you can control how it will treat the contents of your struct.
I am curious to know why the compiler behaves like that ?
Short answer: data alignment.
Two issues to consider: data alignment boundaries and data structure padding. You have some control over each:
Data alignment is the reason your compiler sees what it sees. Data alignment means putting each datum at a memory address that is a multiple of its alignment requirement (for word-sized data, the word size: 4 bytes in a 32-bit environment). Even if you do not use explicit padding, the data is stored such that these boundaries are observed, and the size of the struct will include the padding in the total byte space used.
Structure padding: meaningless bytes placed into a structure so that members are aligned and the total size is a multiple of the alignment. You have this in your example code.
You can use pragma directives to control how the compiler packs a struct. For example, #pragma pack(n) sets the new alignment, and #pragma pack() restores the alignment that was in effect when compilation started.
Example:
#pragma pack(push) /* push current alignment to stack */
#pragma pack(1) /* set alignment to 1 byte boundary */
struct MyPackedData
{
char Data1;
long Data2;
char Data3;
};
#pragma pack(pop) /* restore original alignment from stack */
Note:
The unit of n for pack(n) is the byte. Valid values for n are compiler specific; for MSVC, for example, they are typically 1, 2, 4, 8, and 16.
Question: If you are using pragma pack directives, do they use consistent pack values between the getter()/setter() functions? (credit to @alain)
But again, this will not cause the compiler to disregard the contents of your struct, only process it a different way.
See information here and here for more information on root cause of your observations.
The longer version of my comment to @ryyker's good answer:
The code you have shown in your question is perfectly valid, there is absolutely no reason why you would get the wrong values when reading the struct members in getter, provided
there is no casting
the same packing rules are in effect
Otherwise the compiler you are using would be severely broken.
The way to set the packing rules differs from compiler to compiler. It is not standardized, so maybe it's not named #pragma pack.
"Normally", there is no reason to interfere with structure packing, but one reason is sending data over a network or to a file. When the structs are packed with no padding at all, you can cast them to a void * or char * and pass the structs directly to a "send" function, for example:
send((void *)&data_to_send, sizeof(data_to_send));
The variable name data_to_send in your question is a hint that this could be what happens in this code. I'm not saying this is good practice, but it's quite common, because you don't have to write serializing code.

Suspicious pointer-to-pointer conversion (area too small) in C

uint8_t *buf;
uint16_t *ptr = (uint16_t *)buf;
I feel the above code is correct, but I get a "Suspicious pointer-to-pointer conversion (area too small)" Lint warning. Does anyone know how to resolve this warning?
IMHO, don't cast the pointer at all. It changes the way (representation) a variable (and corresponding memory) is accessed, which is very problematic.
If you have to, cast the value instead.
It's a problem in your code.
In the line uint8_t *buf you must have pointed buf at a memory area that is 8 bits long,
and then in the line uint16_t *ptr = (uint16_t *)buf; you assign it to a pointer that will access that memory as 16 bits.
When you access memory through ptr, you are accessing memory beyond what was assigned to you, and that is undefined behaviour.
The problem here is strict aliasing. Imagine that when you have a case like this:
uint8_t raw_data [N]; // chunk of raw data
uint16_t val = 1;
memcpy(raw_data, &val, sizeof(val)); // copy an uint16_t into this array
uint16_t* lets_go_crazy = (uint16_t*)raw_data;
*lets_go_crazy = 5;
print(raw_data);
Assume the print function prints all the values of the bytes. You might then think that the two first bytes have changed to contain your machine's representation of an uint16_t containing the value 5. Not necessarily so.
The compiler, on the other hand, is free to assume that you have not modified the raw_data array since the memcpy, because a uint16_t lvalue is not allowed to alias a uint8_t object. So it might optimize the whole code into something like:
uint8_t raw_data [N]; // chunk of raw data
uint16_t val = 1;
memcpy(raw_data, &val, sizeof(val)); // copy an uint16_t into this array
print(raw_data);
What will happen is undefined, because the strict aliasing was broken.
Though note that your code might be perfectly fine if, and only if, you can guarantee that your particular compiler does not apply strict aliasing to whatever scenario you have. There are compiler options to disable it on some compilers.
Alternatively, you could use a pointer to a struct or union containing an uint16_t, in which case the aliasing does not apply.

Is gcc's __attribute__((packed)) / #pragma pack unsafe?

In C, the compiler will lay out members of a struct in the order in which they're declared, with possible padding bytes inserted between members, or after the last member, to ensure that each member is aligned properly.
gcc provides a language extension, __attribute__((packed)), which tells the compiler not to insert padding, allowing struct members to be misaligned. For example, if the system normally requires all int objects to have 4-byte alignment, __attribute__((packed)) can cause int struct members to be allocated at odd offsets.
Quoting the gcc documentation:
The `packed' attribute specifies that a variable or structure field
should have the smallest possible alignment--one byte for a variable,
and one bit for a field, unless you specify a larger value with the
`aligned' attribute.
Obviously the use of this extension can result in smaller data requirements but slower code, as the compiler must (on some platforms) generate code to access a misaligned member a byte at a time.
But are there any cases where this is unsafe? Does the compiler always generate correct (though slower) code to access misaligned members of packed structs? Is it even possible for it to do so in all cases?
Yes, __attribute__((packed)) is potentially unsafe on some systems. The symptom probably won't show up on an x86, which just makes the problem more insidious; testing on x86 systems won't reveal the problem. (On the x86, misaligned accesses are handled in hardware; if you dereference an int* pointer that points to an odd address, it will be a little slower than if it were properly aligned, but you'll get the correct result.)
On some other systems, such as SPARC, attempting to access a misaligned int object causes a bus error, crashing the program.
There have also been systems where a misaligned access quietly ignores the low-order bits of the address, causing it to access the wrong chunk of memory.
Consider the following program:
#include <stdio.h>
#include <stddef.h>
int main(void)
{
struct foo {
char c;
int x;
} __attribute__((packed));
struct foo arr[2] = { { 'a', 10 }, {'b', 20 } };
int *p0 = &arr[0].x;
int *p1 = &arr[1].x;
printf("sizeof(struct foo) = %d\n", (int)sizeof(struct foo));
printf("offsetof(struct foo, c) = %d\n", (int)offsetof(struct foo, c));
printf("offsetof(struct foo, x) = %d\n", (int)offsetof(struct foo, x));
printf("arr[0].x = %d\n", arr[0].x);
printf("arr[1].x = %d\n", arr[1].x);
printf("p0 = %p\n", (void*)p0);
printf("p1 = %p\n", (void*)p1);
printf("*p0 = %d\n", *p0);
printf("*p1 = %d\n", *p1);
return 0;
}
On x86 Ubuntu with gcc 4.5.2, it produces the following output:
sizeof(struct foo) = 5
offsetof(struct foo, c) = 0
offsetof(struct foo, x) = 1
arr[0].x = 10
arr[1].x = 20
p0 = 0xbffc104f
p1 = 0xbffc1054
*p0 = 10
*p1 = 20
On SPARC Solaris 9 with gcc 4.5.1, it produces the following:
sizeof(struct foo) = 5
offsetof(struct foo, c) = 0
offsetof(struct foo, x) = 1
arr[0].x = 10
arr[1].x = 20
p0 = ffbff317
p1 = ffbff31c
Bus error
In both cases, the program is compiled with no extra options, just gcc packed.c -o packed.
(A program that uses a single struct rather than array doesn't reliably exhibit the problem, since the compiler can allocate the struct on an odd address so the x member is properly aligned. With an array of two struct foo objects, at least one or the other will have a misaligned x member.)
(In this case, p0 points to a misaligned address, because it points to a packed int member following a char member. p1 happens to be correctly aligned, since it points to the same member in the second element of the array, so there are two char objects preceding it -- and on SPARC Solaris the array arr appears to be allocated at an address that is even, but not a multiple of 4.)
When referring to the member x of a struct foo by name, the compiler knows that x is potentially misaligned, and will generate additional code to access it correctly.
Once the address of arr[0].x or arr[1].x has been stored in a pointer object, neither the compiler nor the running program knows that it points to a misaligned int object. It just assumes that it's properly aligned, resulting (on some systems) in a bus error or similar other failure.
Fixing this in gcc would, I believe, be impractical. A general solution would require, for each attempt to dereference a pointer to any type with non-trivial alignment requirements either (a) proving at compile time that the pointer doesn't point to a misaligned member of a packed struct, or (b) generating bulkier and slower code that can handle either aligned or misaligned objects.
I've submitted a gcc bug report. As I said, I don't believe it's practical to fix it, but the documentation should mention it (it currently doesn't).
UPDATE: As of 2018-12-20, this bug is marked as FIXED. The patch will appear in gcc 9 with the addition of a new -Waddress-of-packed-member option, enabled by default.
When address of packed member of struct or union is taken, it may
result in an unaligned pointer value. This patch adds
-Waddress-of-packed-member to check alignment at pointer assignment and warn unaligned address as well as unaligned pointer
I've just built that version of gcc from source. For the above program, it produces these diagnostics:
c.c: In function ‘main’:
c.c:10:15: warning: taking address of packed member of ‘struct foo’ may result in an unaligned pointer value [-Waddress-of-packed-member]
10 | int *p0 = &arr[0].x;
| ^~~~~~~~~
c.c:11:15: warning: taking address of packed member of ‘struct foo’ may result in an unaligned pointer value [-Waddress-of-packed-member]
11 | int *p1 = &arr[1].x;
| ^~~~~~~~~
As ams said above, don't take a pointer to a member of a struct that's packed. This is simply playing with fire. When you say __attribute__((__packed__)) or #pragma pack(1), what you're really saying is "Hey gcc, I really know what I'm doing." When it turns out that you do not, you can't rightly blame the compiler.
Perhaps we can blame the compiler for its complacency though. While gcc does have a -Wcast-align option, it isn't enabled by default nor with -Wall or -Wextra. This is apparently due to gcc developers considering this type of code to be a brain-dead "abomination" unworthy of addressing -- understandable disdain, but it doesn't help when an inexperienced programmer bumbles into it.
Consider the following:
struct __attribute__((__packed__)) my_struct {
char c;
int i;
};
struct my_struct a = {'a', 123};
struct my_struct *b = &a;
int c = a.i;
int d = b->i;
int *e __attribute__((aligned(1))) = &a.i;
int *f = &a.i;
Here, the type of a is a packed struct (as defined above). Similarly, b is a pointer to a packed struct. The type of the expression a.i is (basically) an int l-value with 1-byte alignment. c and d are both normal ints. When reading a.i, the compiler generates code for unaligned access. When you read b->i, b's type still knows it's packed, so no problem there either. e is a pointer to a one-byte-aligned int, so the compiler knows how to dereference that correctly as well. But when you make the assignment f = &a.i, you are storing the value of an unaligned int pointer in an aligned int pointer variable; that's where you went wrong. And I agree, gcc should have this warning enabled by default (it is not even enabled by -Wall or -Wextra).
It's perfectly safe as long as you always access the values through the struct via the . (dot) or -> notation.
What's not safe is taking the pointer of unaligned data and then accessing it without taking that into account.
Also, even though each item in the struct is known to be unaligned, it's known to be unaligned in a particular way, so the struct as a whole must be aligned as the compiler expects or there'll be trouble (on some platforms, or in future if a new way is invented to optimise unaligned accesses).
Using this attribute is definitely unsafe.
One particular thing it breaks is the ability of a union which contains two or more structs to write one member and read another if the structs have a common initial sequence of members. Section 6.5.2.3 of the C11 standard states:
6 One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common
initial sequence (see below), and if the union object
currently contains one of these structures, it is permitted
to inspect the common initial part of any of them anywhere that a
declaration of the completed type of the union is visible. Two
structures share a common initial sequence if corresponding
members have compatible types (and, for bit-fields, the same widths)
for a sequence of one or more initial members.
...
9 EXAMPLE 3 The following is a valid fragment:
union {
struct {
int alltypes;
}n;
struct {
int type;
int intnode;
} ni;
struct {
int type;
double doublenode;
} nf;
}u;
u.nf.type = 1;
u.nf.doublenode = 3.14;
/*
...
*/
if (u.n.alltypes == 1)
if (sin(u.nf.doublenode) == 0.0)
/*
...
*/
When __attribute__((packed)) is introduced it breaks this. The following example was run on Ubuntu 16.04 x64 using gcc 5.4.0 with optimizations disabled:
#include <stdio.h>
#include <stdlib.h>
struct s1
{
short a;
int b;
} __attribute__((packed));
struct s2
{
short a;
int b;
};
union su {
struct s1 x;
struct s2 y;
};
int main()
{
union su s;
s.x.a = 0x1234;
s.x.b = 0x56789abc;
printf("sizeof s1 = %zu, sizeof s2 = %zu\n", sizeof(struct s1), sizeof(struct s2));
printf("s.y.a=%hx, s.y.b=%x\n", s.y.a, s.y.b);
return 0;
}
Output:
sizeof s1 = 6, sizeof s2 = 8
s.y.a=1234, s.y.b=5678
Even though struct s1 and struct s2 have a "common initial sequence", the packing applied to the former means that the corresponding members don't live at the same byte offset. The result is the value written to member x.b is not the same as the value read from member y.b, even though the standard says they should be the same.
(The following is a very artificial example cooked up to illustrate.) One major use of packed structs is where you have a stream of data (say 256 bytes) to which you wish to supply meaning. If I take a smaller example, suppose I have a program running on my Arduino which sends via serial a packet of 16 bytes which have the following meaning:
0: message type (1 byte)
1: target address, MSB
2: target address, LSB
3: data (chars)
...
F: checksum (1 byte)
Then I can declare something like
typedef struct {
uint8_t msgType;
uint16_t targetAddr; // may have to bswap
uint8_t data[12];
uint8_t checksum;
} __attribute__((packed)) myStruct;
and then I can refer to the targetAddr bytes via aStruct.targetAddr rather than fiddling with pointer arithmetic.
Now with alignment stuff happening, taking a void* pointer in memory to the received data and casting it to a myStruct* will not work unless the compiler treats the struct as packed (that is, it stores data in the order specified and uses exactly 16 bytes for this example). There are performance penalties for unaligned reads, so using packed structs for data your program is actively working with is not necessarily a good idea. But when your program is supplied with a list of bytes, packed structs make it easier to write programs which access the contents.
Otherwise you end up using C++ and writing a class with accessor methods and stuff that does pointer arithmetic behind the scenes. In short, packed structs are for dealing efficiently with packed data, and packed data may be what your program is given to work with. For the most part, your code should read values out of the structure, work with them, and write them back when done. All else should be done outside the packed structure. Part of the problem is the low level stuff that C tries to hide from the programmer, and the hoop jumping that is needed if such things really do matter to the programmer. (You almost need a different 'data layout' construct in the language so that you can say 'this thing is 48 bytes long, foo refers to the data 13 bytes in, and should be interpreted thus'; and a separate structured data construct, where you say 'I want a structure containing two ints, called alice and bob, and a float called carol, and I don't care how you implement it' -- in C both these use cases are shoehorned into the struct construct.)

Resources