I’m not sure whether this is normal or a compiler bug, but I have a C struct with a lot of members. Among them are:
struct list {
...
...
const unsigned char nop=0x90; // 27 bytes from the beginning of the structure
const unsigned char jump=0xeb; // 28 bytes from the beginning of the structure
const unsigned char hlt=0xf4; // 29 bytes from the beginning of the structure
unsigned __int128 i=0xeb90eb90eb90eb90f4f4; // should start at byte 30, but is aligned on a 16-byte boundary and starts at byte 32 instead
const unsigned char data=0x66; // should start at byte 46, but starts at byte 48 instead
}; // end of struct list.
I had a hard time finding out why my program wasn’t working, but I finally found that there is a 2-byte gap between hlt and i, which is set to 0x0. This means that i is being aligned.
This is very clear when I printf that part of the structure, because with:
for(int i=28;i<36;i++)
printf("%02hhX",buf[i]);
I get EBF40000EB90EB90 on the screen.
I tried things like volatile struct list data; in my program, but it didn’t change the alignment problem.
So is there a #pragma or an __attribute__ to tell gcc not to align i inside the struct list type?
In GCC you can use __attribute__((packed)) like this:
// sizeof(x) == 8
struct x
{
char x;
int a;
};
// sizeof(y) == 5
struct y
{
char x;
int a;
} __attribute__((packed));
See doc.
Also if you rely on the addresses of struct fields, take a look at the offsetof macro. Maybe you don't need to pack the structure at all.
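To see what packing does to the struct from the question, here is a minimal sketch that compares member offsets with offsetof. The head[27] array is a hypothetical stand-in for the elided leading members, and unsigned __int128 is assumed to be available (GCC or Clang on a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Hypothetical stand-in for the question's struct: 27 leading bytes,
   three opcode bytes, then the 128-bit member. */
struct list_default {
    unsigned char head[27];
    unsigned char nop;
    unsigned char jump;
    unsigned char hlt;
    unsigned __int128 i;   /* GCC/Clang extension on 64-bit targets */
    unsigned char data;
};

/* Same members, but packed: no padding is inserted anywhere. */
struct list_packed {
    unsigned char head[27];
    unsigned char nop;
    unsigned char jump;
    unsigned char hlt;
    unsigned __int128 i;
    unsigned char data;
} __attribute__((packed));

/* Compile-time checks: packing removes the gap in front of i. */
static_assert(offsetof(struct list_packed, i) == 30, "i follows hlt directly");
static_assert(offsetof(struct list_packed, data) == 46, "data follows i directly");
```

On targets where unsigned __int128 has 16-byte alignment, offsetof(struct list_default, i) is 32, which matches the two zero bytes seen in the question's dump.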
As touched on by @Banex:
#pragma pack(push,1)
struct
{
char a;
int b;
long long c;
} foo;
#pragma pack(pop)
#pragma pack(push,1) saves the current packing mode internally and sets the packing to 1, meaning no padding.
#pragma pack(pop) restores the previous packing.
This is supposedly compatible with Microsoft's syntax:
http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Structure_002dPacking-Pragmas.html
The fields within the struct are padded in an implementation defined manner.
That being said, fields are typically aligned on an offset which is a multiple of the size of the data member (or of the array element, if the member is an array). So a 16-bit field starts on a 2-byte offset, a 32-bit field starts on a 4-byte offset, and so forth.
If you reorder the fields in your struct to adhere to this guideline, you can typically avoid having any internal padding within the struct (although you may end up with some trailing padding).
By putting the fields at the proper offset, there can be performance gains over forcefully packing the struct.
For more details, see this article on structure packing.
While the above techniques are not guaranteed to work, they tend to in most cases.
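As a rough illustration of the reordering guideline, the sketch below declares the same four members in two orders. The struct and member names are made up for the example, and the sizes in the comments are what typical ABIs produce, not something the C standard promises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in an order that forces internal padding on typical ABIs. */
struct interleaved {
    uint8_t  flag;     /* offset 0, usually followed by 3 padding bytes */
    uint32_t counter;  /* offset 4 */
    uint8_t  mode;     /* offset 8, usually followed by 1 padding byte */
    uint16_t level;    /* offset 10 */
};                     /* typically 12 bytes */

/* Same members, largest first: no internal padding is needed. */
struct reordered {
    uint32_t counter;  /* offset 0 */
    uint16_t level;    /* offset 4 */
    uint8_t  flag;     /* offset 6 */
    uint8_t  mode;     /* offset 7 */
};                     /* typically 8 bytes */

static_assert(sizeof(struct reordered) <= sizeof(struct interleaved),
              "largest-first ordering should not make the struct bigger");
```

Both structs keep every member naturally aligned, so no packing attribute is involved and no access becomes slower.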
I am using a for loop to copy an array from a UART RX buffer to memory.
It looks as follows:
UART2_readTimeout(uartR2, rxBuf3, 54, NULL, 500);
GPIO_toggle(CONFIG_GPIO_LED_3);
if ((rxBuf3[4] == 0x8C) && (rxBuf3[10] != 0x8C)) {
int i;
for (i = 0; i < 47; i++) {
sioR2[i]=rxBuf3[i];
}
}
I want to then use a struct such as the following to make it possible to use dot notation when working with and organizing the data:
typedef struct
{
uint16_t voltage;
uint16_t current;
uint16_t outTemp;
uint16_t inTemp;
uint16_t status;
uint32_t FaultA;
uint32_t FaultB;
uint32_t FaultC;
uint32_t FaultD;
uint8_t softwareMode;
uint8_t logicLoad;
uint8_t outputBits;
uint16_t powerOut;
uint32_t runHours;
uint16_t unitAddresses[6];
} unitValues;
Assuming the total lengths are the same, is it possible to perform a memcpy of the entire array onto a single instance of the struct?
Array : 001110101....110001
||||||||||||||||||| <- memcpy
vvvvvvvvvvvvvvvvvvv
Struct : 001110101....110001
Provided that your C implementation offers a way to ensure that the layout of your structure is the same as the layout that the driver in question uses for writing the buffer, a pretty good way to go about this would be to have the driver write directly into the structure. I'm inferring the signature of the driver function here, but that would probably be something like:
UART2_readTimeout(uartR2, (uint8_t *) &values, 54, NULL, 500);
Assuming that uint8_t is an alias for unsigned char or maybe char, it is valid to write into the representation of the structure via a pointer of type uint8_t *. Thus, this avoids you having to make a copy.
The trick, however, is the structure layout. Supposing that you expect the data to be laid out as the structure members given, in the order given, with no gaps, such a structure layout would prevent structure instances being positioned in memory so that all members are aligned on addresses that are multiples of their sizes. Depending on the alignment rules of your hardware, this might be perfectly fine, but probably either it would slow accesses to some of the members, or it would make attempts to access some of the members crash the program.
If you still want to proceed then you will need to check your compiler's documentation for information about how to get the wanted layout of your structure. You might look for references to structure "packing", structure layout, or structure member alignment. There is no standard way to do this: if your C implementation supports it at all then that constitutes an extension, with implementation-specific details.
All the same issues and caveats would apply to using memcpy to copy the buffer contents onto an instance of your structure type, so if you don't need multiple copies of the data and you can arrange to make a bulk copy onto the structure work, then you're better off writing directly onto the structure than writing into a separate buffer and then copying.
On the other hand, the safe and standard alternative would be to allow your implementation to lay out the structure however it thinks is best, and to copy the data out of your buffer into the structure in member-by-member fashion, with per-member memcpy()s. Yes, the code will be a bit tedious, but it will not be sensitive to alignment-related issues, nor even to reordering structure members or adding new ones.
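A minimal sketch of the member-by-member approach might look like the following. The helper name unit_values_from_bytes, the TAKE macro, and the UNIT_VALUES_WIRE_SIZE constant are invented for the example, and it assumes the buffer holds the members back to back in the struct's declared order and in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t voltage, current, outTemp, inTemp, status;
    uint32_t FaultA, FaultB, FaultC, FaultD;
    uint8_t  softwareMode, logicLoad, outputBits;
    uint16_t powerOut;
    uint32_t runHours;
    uint16_t unitAddresses[6];
} unitValues;   /* laid out however the compiler likes */

enum { UNIT_VALUES_WIRE_SIZE = 47 };   /* sum of the member sizes */

/* Copy one member out of the buffer, then advance the cursor past it. */
#define TAKE(member, cursor) \
    do { memcpy(&(member), (cursor), sizeof(member)); (cursor) += sizeof(member); } while (0)

/* Hypothetical helper: fills *out from UNIT_VALUES_WIRE_SIZE packed bytes. */
static void unit_values_from_bytes(unitValues *out, const uint8_t *src)
{
    const uint8_t *cursor = src;

    TAKE(out->voltage, cursor);
    TAKE(out->current, cursor);
    TAKE(out->outTemp, cursor);
    TAKE(out->inTemp, cursor);
    TAKE(out->status, cursor);
    TAKE(out->FaultA, cursor);
    TAKE(out->FaultB, cursor);
    TAKE(out->FaultC, cursor);
    TAKE(out->FaultD, cursor);
    TAKE(out->softwareMode, cursor);
    TAKE(out->logicLoad, cursor);
    TAKE(out->outputBits, cursor);
    TAKE(out->powerOut, cursor);
    TAKE(out->runHours, cursor);
    TAKE(out->unitAddresses, cursor);   /* the whole 12-byte array at once */

    /* Every byte of the wire format has been consumed exactly once. */
    assert(cursor - src == UNIT_VALUES_WIRE_SIZE);
}
```

Because each memcpy targets a single member, padding inside unitValues never matters; if the sender uses a different byte order, each multi-byte member would additionally need swapping.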
You have to change the packing alignment to 1 byte for the structure.
#pragma pack(1) /* change */
typedef struct {
...
} unitValues;
#pragma pack() /* restore */
In theory, you can use memcpy() to set the member fields of a struct from the elements of a byte array. However, you will need to be very careful to prevent your compiler from adding 'empty' fields to your struct (see: Structure padding and packing) unless those empty fields are taken into account when loading the data into the source array. (The elements of the source array will be packed into contiguous memory.)
Different compilers use different command-line and/or #pragma options to control structure packing but, for the MSVC compiler, you can use the #pragma pack(n) directives in your source code or the /Zp command-line switch.
Using the MSVC compiler, the structure you have provided will have a total size of 47 bytes only if you have single-byte packing; for default packing, the size will be 52 bytes.
The following code block shows where these 'extra' bytes will be inserted for different packing sizes.
#pragma pack(push, 1) // This saves the current packing level then sets it to "n" (1, here)
typedef struct {
uint16_t voltage;
uint16_t current;
uint16_t outTemp;
uint16_t inTemp;
uint16_t status;
// 4+ byte packing will insert two bytes here
uint32_t FaultA;
uint32_t FaultB;
uint32_t FaultC;
uint32_t FaultD;
uint8_t softwareMode;
uint8_t logicLoad;
uint8_t outputBits;
// 2+ byte packing will insert one byte here
uint16_t powerOut;
// 4+ byte packing will insert two bytes here
uint32_t runHours;
uint16_t unitAddresses[6];
} unitValues;
#pragma pack(pop) // This restores the previous packing level
So, the sizeof(unitValues) will be:
47 bytes when using #pragma pack(1)
48 bytes when using #pragma pack(2)
52 bytes when using #pragma pack(4) (or any higher/default value)
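If you want the compiler to confirm these numbers, a sketch along the following lines works with compilers that accept #pragma pack(push, n), such as MSVC and GCC. The UNIT_VALUES_MEMBERS macro and the three type names are only there for the demonstration; they are not part of the question's code.

```c
#include <assert.h>
#include <stdint.h>

/* The question's members, written once so every variant stays identical. */
#define UNIT_VALUES_MEMBERS                              \
    uint16_t voltage, current, outTemp, inTemp, status;  \
    uint32_t FaultA, FaultB, FaultC, FaultD;             \
    uint8_t  softwareMode, logicLoad, outputBits;        \
    uint16_t powerOut;                                   \
    uint32_t runHours;                                   \
    uint16_t unitAddresses[6];

#pragma pack(push, 1)
typedef struct { UNIT_VALUES_MEMBERS } unitValuesPack1;
#pragma pack(pop)

#pragma pack(push, 2)
typedef struct { UNIT_VALUES_MEMBERS } unitValuesPack2;
#pragma pack(pop)

/* No pragma in effect: the compiler's default layout. */
typedef struct { UNIT_VALUES_MEMBERS } unitValuesDefault;

/* Fail the build if the packed layouts ever change size. */
static_assert(sizeof(unitValuesPack1) == 47, "byte-packed layout has no padding");
static_assert(sizeof(unitValuesPack2) == 48, "2-byte packing adds one padding byte");
```

A static_assert like this next to the real struct definition is a cheap guard against someone changing the packing or the member list without updating the 47-byte buffer code.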
I am porting a C application to an ARM platform. The application also runs on an x86 processor and must be backward compatible.
I am now having some issues with variable alignment. I have read the gcc manual for
__attribute__((aligned(4),packed)). I interpret it as saying that the start of the struct is aligned to a 4-byte boundary and the inside remains untouched because of the packed attribute.
Originally I had this, but occasionally it gets placed unaligned with the 4-byte boundary.
typedef struct
{
unsigned int code;
unsigned int length;
unsigned int seq;
unsigned int request;
unsigned char nonce[16];
unsigned short crc;
} __attribute__((packed)) CHALLENGE;
So I changed it to this.
typedef struct
{
unsigned int code;
unsigned int length;
unsigned int seq;
unsigned int request;
unsigned char nonce[16];
unsigned short crc;
} __attribute__((aligned(4),packed)) CHALLENGE;
The understanding I stated earlier seems to be incorrect: the struct is now aligned to a 4-byte boundary, and the inside data is now also aligned to a 4-byte boundary, but because of the endianness, the size of the struct has increased from 42 to 44 bytes. This size is critical, as we have other applications that depend on the struct being 42 bytes.
Could someone describe to me how to perform the operation that I require? Any help is much appreciated.
If you're depending on sizeof(yourstruct) being 42 bytes, you're about to be bitten by a world of non-portable assumptions. You haven't said what this is for, but it seems likely that the endianness of the struct contents matters as well, so you may also have a mismatch with the x86 there too.
In this situation I think the only sure-fire way to cope is to use unsigned char[42] in the parts where it matters. Start by writing a precise specification of exactly what fields are where in this 42-byte block, and what endian, then use that definition to write some code to translate between that and a struct you can interact with. The code will likely be either all-at-once serialisation code (aka marshalling), or a bunch of getters and setters.
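As a sketch of the getters-and-setters variant: the 42-byte size comes from the question, but the little-endian byte order and the offset of the code field (OFF_CODE) are assumptions made purely for illustration, and all helper names are hypothetical. The real values have to come from the written specification of the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHALLENGE_WIRE_SIZE = 42 };  /* size stated in the question */
enum { OFF_CODE = 0 };              /* hypothetical offset of the code field */

/* Read a 32-bit little-endian value; works at any address and on any host. */
static uint32_t get_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Write a 32-bit value as four little-endian bytes. */
static void put_u32_le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Field accessors that operate directly on the 42-byte block. */
static uint32_t challenge_get_code(const unsigned char *wire)
{
    assert(wire != NULL);
    return get_u32_le(wire + OFF_CODE);
}

static void challenge_set_code(unsigned char *wire, uint32_t code)
{
    assert(wire != NULL);
    put_u32_le(wire + OFF_CODE, code);
}
```

Because the accessors only ever touch individual bytes, the same source produces the same wire bytes on x86 and ARM, whatever the compiler does with struct layout.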
This is one reason why reading whole structs instead of memberwise fails, and should be avoided.
In this case, packing plus aligning at 4 means there will be two bytes of padding. This happens because the size must be compatible for storing the type in an array with all items still aligned at 4.
I imagine you have something like:
read(fd, &obj, sizeof obj);
Because you don't want to read those 2 padding bytes which belong to different data, you have to specify the size explicitly:
read(fd, &obj, 42);
Which you can keep maintainable:
typedef struct {
//...
enum { read_size = 42 };
} __attribute__((aligned(4),packed)) CHALLENGE;
// ...
read(fd, &obj, obj.read_size);
Or, if you can't use some features of C++ in your C:
typedef struct {
//...
} __attribute__((aligned(4),packed)) CHALLENGE;
enum { CHALLENGE_read_size = 42 };
// ...
read(fd, &obj, CHALLENGE_read_size);
At the next refactoring opportunity, I would strongly suggest you start reading each member individually, which can easily be encapsulated within a function.
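The two bytes of trailing padding are easy to check on GCC. The sketch below swaps the question's unsigned int and unsigned short for fixed-width types so the arithmetic is unambiguous; with 4-byte integers the members add up to 34 bytes rather than the 42 mentioned in the question, so treat the absolute numbers as belonging to this sketch only. The type names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packed only: alignment 1, no padding anywhere. */
typedef struct {
    uint32_t code, length, seq, request;
    unsigned char nonce[16];
    uint16_t crc;
} __attribute__((packed)) challenge_packed;

/* Packed and aligned(4): same member offsets, but the size is rounded up
   so that every element of an array of these starts on a 4-byte boundary. */
typedef struct {
    uint32_t code, length, seq, request;
    unsigned char nonce[16];
    uint16_t crc;
} __attribute__((aligned(4), packed)) challenge_packed_aligned;

static_assert(sizeof(challenge_packed) == 34, "members only, no padding");
static_assert(sizeof(challenge_packed_aligned) == 36, "two trailing padding bytes");
static_assert(offsetof(challenge_packed_aligned, crc) == offsetof(challenge_packed, crc),
              "aligned(4) does not move any member");
```

Since only the tail changes, reading or writing the unpadded size into the aligned struct, as suggested above, touches exactly the member bytes.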
I've been moving structures back and forth from Linux, Windows, Mac, C, Swift, Assembly, etc.
The problem is NOT that it can't be done, the problem is that you can't be lazy and must understand your tools.
I don't see why you can't use:
typedef struct
{
unsigned int code;
unsigned int length;
unsigned int seq;
unsigned int request;
unsigned char nonce[16];
unsigned short crc;
} __attribute__((packed)) CHALLENGE;
You can use it and it doesn't require any special or clever code. I write a LOT of code that communicates to ARM. Structures are what make things work. __attribute__ ((packed)) is my friend.
The odds of being in a "world of hurt" are nil if you understand what is going on with both.
Finally, I can't for the life of me make out how you get 42 or 44. An int is either 4 or 8 bytes (depending on the compiler). That puts the number at either 16+16+2=34 or 32+16+2=50, assuming it is truly packed.
As I say, knowing your tools is part of your problem.
What is your true goal?
If it's to deal with data that's in a file or on the wire in a particular format, what you should do is write some marshaling/serialization routines that move the data between the compiler struct, which represents how you want to deal with the data inside the program, and a char array, which represents how the data looks on the wire or in the file.
Then all that needs to be dealt with carefully and possibly have platform specific code is the marshaling routines. And you can write some nice-n-nasty unit tests to ensure that the marshaled data gets to and from the struct properly no matter what platform you might have to port to today and in the future.
I would guess that the problem is that 42 isn't divisible by 4, and so they get out of alignment if you put several of these structs back to back (e.g. allocate memory for several of them, determining the size with sizeof). Having the size as 44 forces the alignment in these cases as you requested. However, if the internal offset of each struct member remains the same, you can treat the 44 byte struct as though it was 42 bytes (as long as you take care to align any following data at the correct boundary).
One trick to try might be putting both of these structs inside a single union type and only using the 42-byte version from within each such union.
As I am using Linux, I have found that after echo 3 > /proc/cpu/alignment the kernel issues a warning and fixes up the alignment issue. This is a workaround, but it is very helpful for locating where the structures are failing to be aligned.
In GCC C, how can I get an array of packed structs, where each entry in the array is aligned?
(FWIW, this is on a PIC32, using the MIPS4000 architecture).
I have this (simplified):
typedef struct __attribute__((packed))
{
uint8_t frameLength;
unsigned frameType :3;
uint8_t payloadBytes;
uint8_t payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;
RADIO_PACKETs are packed internally.
Then I have RADIO_PACKET_QUEUE, which is a queue of RADIO_PACKETs:
typedef struct
{
short read; // next buffer to read
short write; // next buffer to write
short count; // number of packets stored in q
short buffers; // number of buffers allocated in q
RADIO_PACKET q[RADIO_RX_PACKET_BUFFERS];
} RADIO_PACKET_QUEUE;
I want each RADIO_PACKET in the array q[] to start at an aligned address (modulo 4 address).
But right now GCC doesn't align them, so I get address exceptions when trying to read q[n] as a word. For example, this gives an exception:
RADIO_PACKET_QUEUE rpq;
int foo = *(int*) &(rpq.q[1]);
Perhaps this is because of the way I declared RADIO_PACKET as packed.
I want each RADIO_PACKET to remain packed internally, but want GCC to add padding as needed after each array element so each RADIO_PACKET starts at an aligned address.
How can I do this?
Since you specify that you are using GCC, you should be looking at type attributes. In particular, if you want RADIO_PACKETs to be aligned on 4-byte (or wider) boundaries, then you would use __attribute__((aligned (4))) on the type. When applied to a struct, it describes the alignment of instances of the overall struct, not (directly) the alignment of any individual members, so it is possible to use it together with attribute packed:
typedef struct __attribute__((aligned(4), packed))
{
uint8_t frameLength;
unsigned frameType :3;
uint8_t payloadBytes;
uint8_t payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;
The packed attribute prevents padding between structure elements, but it does not prevent trailing padding in the structure representation, exactly as is necessary for ensuring the required alignment for every element of an array of the specified type. You should not then need to do anything special in the declaration of RADIO_PACKET_QUEUE.
This is a lot cleaner and clearer than the alternative you came up with, but it is GCC-specific. Since you were GCC-specific already, I don't see that being a problem.
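A quick way to convince yourself is to let the compiler check it. In this sketch the two array-size macros are given hypothetical values (the question does not state them), and packet_is_word_aligned is a helper invented for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

#define RADIO_RX_PAYLOAD_WORST   10  /* hypothetical value */
#define RADIO_RX_PACKET_BUFFERS  4   /* hypothetical value */

typedef struct __attribute__((aligned(4), packed))
{
    uint8_t  frameLength;
    unsigned frameType : 3;
    uint8_t  payloadBytes;
    uint8_t  payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;

typedef struct
{
    short read;     /* next buffer to read */
    short write;    /* next buffer to write */
    short count;    /* number of packets stored in q */
    short buffers;  /* number of buffers allocated in q */
    RADIO_PACKET q[RADIO_RX_PACKET_BUFFERS];
} RADIO_PACKET_QUEUE;

/* The size is padded to a multiple of the alignment, so arrays stay aligned. */
static_assert(sizeof(RADIO_PACKET) % 4 == 0, "trailing padding was added");
static_assert(_Alignof(RADIO_PACKET) == 4, "the type carries 4-byte alignment");

/* Returns nonzero when q[index] starts on a 4-byte boundary. */
static int packet_is_word_aligned(const RADIO_PACKET_QUEUE *queue, int index)
{
    return ((uintptr_t)&queue->q[index] % 4) == 0;
}
```

With this definition every q[n] starts on a word boundary; whether reading a packet through an int pointer is otherwise a good idea is a separate question.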
You can wrap your packed structure inside another structure that is not packed, and then make an array of these wrapper structures.
A second solution could be to add a dummy char[] member at the end of the packed structure. In this case you will need to calculate its size somehow, probably manually.
I would also suggest rearranging your structure by placing longer members first and uint8_t members last (assuming you have 16/32-bit members and are not doing some hardware mapping).
I think I've solved this, based on the hint from @Nick in the question comments.
I added a RADIO_PACKET_ALIGNED wrapper around RADIO_PACKET. This includes calculated padding.
Then I substituted RADIO_PACKET_ALIGNED for RADIO_PACKET in the RADIO_PACKET_QUEUE structure.
Seems to work:
typedef struct
{
RADIO_PACKET packet;
uint8_t padding[3 - (sizeof(RADIO_PACKET) + 3) % 4];
} RADIO_PACKET_ALIGNED;
typedef struct
{
short read; // next buffer to read
short write; // next buffer to write
short count; // number of packets stored in q
short buffers; // number of buffers allocated in q
RADIO_PACKET_ALIGNED q[RADIO_RX_PACKET_BUFFERS];
} RADIO_PACKET_QUEUE;
Thanks to all the commenters!
Edit: A more portable version of the wrapper would use:
uint8_t padding[(sizeof(int) - 1) - (sizeof(RADIO_PACKET) + (sizeof(int) - 1)) % sizeof(int)];
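The padding expression is easier to trust once it is pulled out into a function and tried on a few sizes. The function name tail_padding is hypothetical; the arithmetic is the same as in the wrapper above.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes to append so that size becomes a multiple of word. */
static size_t tail_padding(size_t size, size_t word)
{
    assert(word > 0);
    return (word - 1) - (size + (word - 1)) % word;
}
```

For example, with a word of 4 a 13-byte packet gets 3 padding bytes, a 14-byte packet gets 2, and a 16-byte packet gets 0. The last case means the wrapper declares a zero-length array, which GCC accepts as an extension but standard C does not.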
My code has a structure type-defined as follows:
typedef struct
{
Structure_2 a[4];
UCHAR b;
UCHAR c;
}Structure_1;
where the definition of Structure_2 is as follows:
typedef struct
{
ULONG x;
USHORT y;
UCHAR z;
}Structure_2;
There are also two functions in the code. The first one (named setter) declares a structure of type “Structure_1” and fills it with the data:
void setter (void)
{
Structure_1 data_to_send ;
data_to_send.a[0].x = 0x12345678;
data_to_send.a[0].y = 0x1234;
data_to_send.a[0].z = 0x12;
data_to_send.a[1].x = 0x12345678;
data_to_send.a[1].y = 0x1234;
data_to_send.a[1].z = 0x12;
data_to_send.a[2].x = 0x12345678;
data_to_send.a[2].y = 0x1234;
data_to_send.a[2].z = 0x12;
data_to_send.a[3].x = 0x12345678;
data_to_send.a[3].y = 0xAABB;
data_to_send.a[3].z = 0x12;
data_to_send.b =0;
data_to_send.c = 0;
getter(&data_to_send);
}
The compiler saves data_to_send in memory like that:
The second one is named getter:
void getter (Structure_1 * ptr_to_data)
{
UCHAR R_1 = ptr_to_data -> b;
UCHAR R_2 = ptr_to_data -> c;
/* The remaining bytes are received */
}
I expect that R_1 will have the value “00”, and R_2 will have the value “00”.
But what happens is that the compiler translates these two lines as follows:
/* Get the data at the address ptr_to_data -> b,
   which equals the start address of structure + 28 which contains the
   value “AA”, and hence R_1 will have “AA” */
UCHAR R_1 = ptr_to_data -> b;
/* Get the data at the address ptr_to_data -> c,
   which equals the start address of structure + 29 which contains the
   value “BB”, and hence R_2 will have “BB” */
UCHAR R_2 = ptr_to_data -> c;
The compiler adds a padding byte when saving the structure on the stack. However, when it starts reading it, it forgets what it did (and includes the padding bytes in the read).
How could I inform the compiler that it should skip the padding bytes while reading the elements of the structure?
I don't want a workaround for this problem; I am curious to know why the compiler behaves like that.
My compiler is GreenHills and my target is 32-bit.
How could I inform the compiler that it should skip the padding bytes while reading the elements of the structure?
Short answer: You cannot.
The compiler will not disregard contents contained in your struct. However, you can control how it will treat the contents of your struct.
I am curious to know why the compiler behaves like that.
Short answer: data alignment.
Two issues to consider: data alignment boundaries and data structure padding. You have some control over each:
Data alignment is the reason your compiler sees what it sees. It means placing data at a memory address that is a multiple of some alignment, commonly the word size (4 bytes in a 32-bit environment). Even if you do not use explicit padding, the data is stored so that these boundaries are observed, and the size of the struct reflects the padding in the total number of bytes used.
Structure padding: meaningless bytes inserted into a structure so that its members are aligned and its size is a multiple of the alignment. You have this in your example code.
You can use #pragma directives that tell the compiler to pack a struct in a particular way. For example, #pragma pack(n) sets the new alignment, and #pragma pack() restores the alignment that was in effect when compilation started.
Example:
#pragma pack(push) /* push current alignment to stack */
#pragma pack(1) /* set alignment to 1 byte boundary */
struct MyPackedData
{
char Data1;
long Data2;
char Data3;
};
#pragma pack(pop) /* restore original alignment from stack */
Note:
The unit of n for pack(n) is byte. Values for n are compiler specific, for MSVC for example are typically 1, 2, 4, 8, and 16.
Question: If you are using pragma pack directives, do they use consistent pack values between the getter()/setter() functions? (credit to @alain)
But again, this will not cause the compiler to disregard the contents of your struct, only process it a different way.
See here and here for more information on the root cause of your observations.
The longer version of my comment on @ryyker's good answer:
The code you have shown in your question is perfectly valid, there is absolutely no reason why you would get the wrong values when reading the struct members in getter, provided
there is no casting
the same packing rules are in effect
Otherwise the compiler you are using would be severely broken.
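For what it's worth, the offsets quoted in the question (28 and 29) are exactly what you would get if getter were compiled with 1-byte packing while setter used the default layout. This is only one possible explanation, and the sketch below assumes ULONG, USHORT and UCHAR are 32, 16 and 8 bits wide and that the compiler accepts #pragma pack(push, n); the _default and _packed type names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: Structure_2 is padded to 8 bytes. */
typedef struct { uint32_t x; uint16_t y; uint8_t z; } Structure_2_default;
typedef struct { Structure_2_default a[4]; uint8_t b; uint8_t c; } Structure_1_default;

/* 1-byte packing: Structure_2 shrinks to 7 bytes. */
#pragma pack(push, 1)
typedef struct { uint32_t x; uint16_t y; uint8_t z; } Structure_2_packed;
typedef struct { Structure_2_packed a[4]; uint8_t b; uint8_t c; } Structure_1_packed;
#pragma pack(pop)

/* In the packed view, b and c sit at the offsets reported in the question. */
static_assert(offsetof(Structure_1_packed, b) == 28, "4 * 7 bytes precede b");
static_assert(offsetof(Structure_1_packed, c) == 29, "c follows b");
```

In the default layout, byte 28 is the first byte of a[3].y, which setter filled with 0xAABB, while b only starts at offset 32. A translation unit that sees the packed definition would therefore read the two bytes of a[3].y as b and c.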
The way to set the packing rules differs from compiler to compiler. It is not standardized, so maybe it's not named #pragma pack.
"Normally", there is no reason to interfere with structure packing, but one reason is sending data over a network or to a file. When the structs are packed with no padding at all, you can cast them to a void * or char * and pass the structs directly to a "send" function, for example:
send((void *)&data_to_send, sizeof(data_to_send));
The variable name data_to_send in your question is a hint that this could be what happens in this code. I'm not saying this is good practice, but it's quite common, because you don't have to write serializing code.