Structure padding with zero length array - c

What is the best practice for using a variable-length array inside a structure?
Say
typedef struct foo_s {
uint32_t data_type;
uint16_t data_len;
uint8_t data[];
} foo_t;
On an x86_64 machine with GCC 4.8, I got
sizeof(foo_t) == 8, but
offsetof(foo_t, data) == 6
It looks like there is a discrepancy: there is no padding after data_len, but there is padding at the end of the structure.
Should I always keep the largest member last to avoid this? i.e.
typedef struct foo_s {
uint16_t data_len;
uint32_t data_type;
uint8_t data[];
} foo_t;
What's the best practice for using a variable-length array?

Unless you have a particular reason for wanting data to be 4-byte aligned (and if so, why is it a uint8?), the first one is mildly preferable because it'll save you a couple of bytes. For a variable-length structure like this, the value reported by sizeof is not really relevant, for exactly this reason. If you decide to allocate sizeof(foo_t) + data_len bytes for it then you'll be wasting a couple of bytes, but you'd waste them in padding in your second structure definition anyway.
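To make the sizeof versus offsetof point concrete, here is a minimal sketch of an allocator for the first layout. The helper name foo_new is hypothetical (it is not from the question), and clamping the request up to sizeof(foo_t) is a conservative assumption of mine rather than something the answer requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct foo_s {
    uint32_t data_type;
    uint16_t data_len;
    uint8_t  data[];   /* flexible array member, must be last */
} foo_t;

/* Hypothetical helper: allocate a foo_t with room for len payload bytes. */
static foo_t *foo_new(uint32_t type, const uint8_t *src, uint16_t len)
{
    /* The bytes actually used are offsetof(foo_t, data) + len.
       Never request less than sizeof(foo_t), so the whole header
       (including its tail padding) is always backed by storage. */
    size_t need = offsetof(foo_t, data) + len;
    if (need < sizeof(foo_t))
        need = sizeof(foo_t);

    foo_t *f = malloc(need);
    if (f == NULL)
        return NULL;

    f->data_type = type;
    f->data_len  = len;
    if (len > 0)
        memcpy(f->data, src, len);
    return f;
}
```

Compared with sizeof(foo_t) + len, this reuses the tail padding for payload bytes, which is where the "couple of bytes" saving comes from.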

If you want to pack your structs without sacrificing alignment then yes: the best option is to order the members by size, either decreasing or increasing. The flexible array member must be the last element, so here your best option is decreasing order. Note that the size win is small. It would only matter for a large array of structs, and a struct with a flexible array member can't be an array element anyway.
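As a rough illustration of the ordering rule, the two hypothetical structs below hold the same members in different orders. On a typical ABI where uint32_t is 4-byte aligned, the first comes out at 12 bytes and the second at 8, but the exact numbers are implementation-defined.

```c
#include <stdint.h>

/* Members in arbitrary order: padding is needed before b and after c. */
struct mixed_order {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
};

/* Same members, largest first: the two small members share one word. */
struct decreasing_order {
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};
```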

Related

How to get aligned array of packed structs in GCC C?

In GCC C, how can I get an array of packed structs, where each entry in the array is aligned?
(FWIW, this is on a PIC32, using the MIPS4000 architecture).
I have this (simplified):
typedef struct __attribute__((packed))
{
uint8_t frameLength;
unsigned frameType :3;
uint8_t payloadBytes;
uint8_t payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;
RADIO_PACKETs are packed internally.
Then I have RADIO_PACKET_QUEUE, which is a queue of RADIO_PACKETs:
typedef struct
{
short read; // next buffer to read
short write; // next buffer to write
short count; // number of packets stored in q
short buffers; // number of buffers allocated in q
RADIO_PACKET q[RADIO_RX_PACKET_BUFFERS];
} RADIO_PACKET_QUEUE;
I want each RADIO_PACKET in the array q[] to start at an aligned address (modulo 4 address).
But right now GCC doesn't align them, so I get address exceptions when trying to read q[n] as a word. For example, this gives an exception:
RADIO_PACKET_QUEUE rpq;
int foo = *(int*) &(rpq.q[1]);
Perhaps this is because of the way I declared RADIO_PACKET as packed.
I want each RADIO_PACKET to remain packed internally, but want GCC to add padding as needed after each array element so each RADIO_PACKET starts at an aligned address.
How can I do this?
Since you specify that you are using GCC, you should be looking at type attributes. In particular, if you want RADIO_PACKETs to be aligned on 4-byte (or wider) boundaries, then you would use __attribute__((aligned (4))) on the type. When applied to a struct, it describes the alignment of instances of the overall struct, not (directly) the alignment of any individual members, so it is possible to use it together with attribute packed:
typedef struct __attribute__((aligned(4), packed))
{
uint8_t frameLength;
unsigned frameType :3;
uint8_t payloadBytes;
uint8_t payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;
The packed attribute prevents padding between structure elements, but it does not prevent trailing padding in the structure representation, exactly as is necessary for ensuring the required alignment for every element of an array of the specified type. You should not then need to do anything special in the declaration of RADIO_PACKET_QUEUE.
This is a lot cleaner and clearer than the alternative you came up with, but it is GCC-specific. Since you were GCC-specific already, I don't see that being a problem.
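A quick way to convince yourself is to check the size and the element addresses. This is only a sketch: the value of RADIO_RX_PAYLOAD_WORST and the helper all_aligned are made up for illustration, and it assumes a compiler that understands GCC attributes.

```c
#include <stddef.h>
#include <stdint.h>

#define RADIO_RX_PAYLOAD_WORST 10   /* assumed value, for illustration only */

typedef struct __attribute__((aligned(4), packed))
{
    uint8_t  frameLength;
    unsigned frameType : 3;
    uint8_t  payloadBytes;
    uint8_t  payload[RADIO_RX_PAYLOAD_WORST];
} RADIO_PACKET;

/* Hypothetical check: returns 1 if q[0..n) all start on 4-byte boundaries. */
static int all_aligned(const RADIO_PACKET *q, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((uintptr_t)&q[i] % 4 != 0)
            return 0;
    }
    return 1;
}
```

Because sizeof is always a multiple of the type's alignment, the trailing padding is added for you and every array element lands on a 4-byte boundary.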
You can wrap your packed structure inside another structure that is not packed, and then build the array from those wrapper structures.
A second solution would be to add a dummy char[] member at the end of the packed structure. In that case you need to calculate its size somehow, probably by hand.
I would also suggest rearranging your structure by placing the longer members first and the uint8_t members last (assuming you have 16/32-bit members and are not mapping hardware registers).
I think I've solved this, based on the hint from @Nick in the question comments.
I added a RADIO_PACKET_ALIGNED wrapper around RADIO_PACKET. This includes calculated padding.
Then I substituted RADIO_PACKET_ALIGNED for RADIO_PACKET in the RADIO_PACKET_QUEUE structure.
Seems to work:
typedef struct
{
RADIO_PACKET packet;
uint8_t padding[3 - (sizeof(RADIO_PACKET) + 3) % 4];
} RADIO_PACKET_ALIGNED;
typedef struct
{
short read; // next buffer to read
short write; // next buffer to write
short count; // number of packets stored in q
short buffers; // number of buffers allocated in q
RADIO_PACKET_ALIGNED q[RADIO_RX_PACKET_BUFFERS];
} RADIO_PACKET_QUEUE;
Thanks to all the commenters!
Edit: A more portable version of the wrapper would use:
uint8_t padding[(sizeof(int) - 1) - (sizeof(RADIO_PACKET) + (sizeof(int) - 1)) % sizeof(int)];
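The padding expression is easier to trust once you try a few sizes. Here is the same arithmetic as a small function; pad_bytes is a hypothetical name, and align stands in for 4 or sizeof(int).

```c
#include <stddef.h>

/* Bytes to append so that size + padding is a multiple of align.
   Same formula as the wrapper's array dimension. */
static size_t pad_bytes(size_t size, size_t align)
{
    return (align - 1) - (size + (align - 1)) % align;
}
```

For align = 4 this yields 3, 2, 1, 0 for sizes 13, 14, 15, 16. Note that the 0 case produces a zero-length array in the wrapper, which GCC accepts as an extension but standard C does not.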

What is the significance of union in this code, what is the disadvantage if structure?

struct queue_entry_s {
odp_buffer_hdr_t *head;
odp_buffer_hdr_t *tail;
int status;
enq_func_t enqueue ODP_ALIGNED_CACHE;
deq_func_t dequeue;
enq_multi_func_t enqueue_multi;
deq_multi_func_t dequeue_multi;
odp_queue_t handle;
odp_buffer_t sched_buf;
odp_queue_type_t type;
odp_queue_param_t param;
odp_pktio_t pktin;
odp_pktio_t pktout;
char name[ODP_QUEUE_NAME_LEN];
};
typedef union queue_entry_u {
struct queue_entry_s s;
uint8_t pad[ODP_CACHE_LINE_SIZE_ROUNDUP(sizeof(struct queue_entry_s))];
} queue_entry_t;
typedef struct queue_table_t {
queue_entry_t queue[ODP_CONFIG_QUEUES];
} queue_table_t;
static queue_table_t *queue_tbl;
#define ODP_CACHE_LINE_SIZE 64
#define ODP_ALIGN_ROUNDUP(x, align)\
((align) * (((x) + align - 1) / (align)))
#define ODP_CACHE_LINE_SIZE_ROUNDUP(x)\
ODP_ALIGN_ROUNDUP(x, ODP_CACHE_LINE_SIZE)
In the above code (typedef union queue_entry_u), what is the significance of the union? If we used a structure instead (typedef struct queue_entry_u), would there be any disadvantage?
unions have several usages:
union saves some memory. It makes it so that s and pad sit in the same place in memory. It is useful if you know that only one of them is needed then you can use a union.
It is also useful to be able to iterate over the fields in your struct. By saving the fields in a union you have both an array and a struct so if you iterate over pad you are in essence iterating over the bytes of s.
unions are also useful in general for casting. The syntax is a little prettier to serialize your entry into a byte array by just using the union.
In this case it looks like the union is used to pad the size of s out to a whole number of cache lines. If the size of a queue_entry_s is already an exact multiple of the cache line length, then s and pad occupy exactly the same memory and no space is wasted. Otherwise pad is larger than s, and the size of the union is still an exact multiple of the cache line length.
This being said it is usually only a good idea to use unions if you are writing embedded code for devices very low on memory or with very stringent performance requirements. They are very dangerous and very easy to misuse by accidentally writing over memory that was meant to represent the other type in the union.
Let's start with the definition of a union from K&R 2nd edition:
A union is a variable that may hold (at different times) objects of
different types [...]. Unions provide a way to manipulate different
kinds of data in a single area of storage.
The union in the question contains two objects: a structure of type struct queue_entry_s and an array of uint8_t. It's important to note that those two objects overlap in memory. Specifically, the address where the structure starts is the same as the address where the array starts. If you write to the structure, the contents of the array will be changed, and if you write to the array, then the contents of the structure will be changed.
Then note that the ODP_CACHE_LINE_SIZE_ROUNDUP macro takes a size and computes the smallest multiple of 64 that is greater than or equal to that size.
The size of the union is determined by the size of the largest member. So for example, if the sizeof(struct queue_entry_s) is 80, then the sizeof of the pad array will be 128, and the sizeof the union will be 128.
Which brings us finally to the answer. The purpose of the union is to increase the memory used by the structure, so that the structure always uses a multiple of 64 bytes of memory.
If you were to change typedef union queue_entry_u to typedef struct queue_entry_u, then the memory layout would change. Instead of having s and pad overlap in memory, the pad array would follow the s structure in memory. So if s occupies 80 bytes and pad occupies 128 bytes, then the typedef struct queue_entry_u would define an object that occupies 208 bytes of memory. That would be a waste of memory, and wouldn't comply with the multiple-of-64 requirement.
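A cut-down version of the same trick shows the difference between the two layouts. The names below (entry_s, entry_t, entry_as_struct and the shortened macros) are stand-ins I made up, not the ODP definitions. With 8-byte pointers entry_s happens to come out at 80 bytes, which matches the numbers above, but that depends on the ABI.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 64
#define ALIGN_ROUNDUP(x, align) ((align) * (((x) + (align) - 1) / (align)))
#define CACHE_LINE_SIZE_ROUNDUP(x) ALIGN_ROUNDUP(x, CACHE_LINE_SIZE)

struct entry_s {                 /* hypothetical stand-in for queue_entry_s */
    void *head;
    void *tail;
    int   status;
    char  name[60];
};

/* Union: s and pad overlap, so the size is that of the larger member. */
typedef union entry_u {
    struct entry_s s;
    uint8_t pad[CACHE_LINE_SIZE_ROUNDUP(sizeof(struct entry_s))];
} entry_t;

/* Struct: pad follows s, so the two sizes add up instead. */
struct entry_as_struct {
    struct entry_s s;
    uint8_t pad[CACHE_LINE_SIZE_ROUNDUP(sizeof(struct entry_s))];
};
```

Here sizeof(entry_t) is always a multiple of 64, while sizeof(struct entry_as_struct) is the size of s plus the size of pad.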

Various length structure in C for memory manager?

I am practicing by implementing a memory manager in C.
I want a structure that has a variable length and is self-describing.
So I found something like this in a POSIX textbook:
struct layout
{
uint32_t size; // array size in bytes, include space after the struct
uchar_t data[1];
};
// But, is next line correct?
layout *val = malloc (array_memory_in_bytes + sizeof (uint32_t) - 1);
// Where does a static array keep the pointer for using it?
If I have several of these structures one after another in a contiguous piece of memory, and I want to be able to iterate through them, can I write something like this:
layout *val1 = pointer;
layout *val2 = val1 + val1.size + sizeof (val1.size);
Or can you recommend me a better approach?
The Standard C version of this is called a flexible array member, and it looks like:
struct layout
{
uint32_t size;
uchar_t data[];
};
// allocate one of these blocks (in a function)
struct layout *val = malloc( sizeof *val + number_of_bytes );
val->size = number_of_bytes;
The code val1->data + val1->size will get you a pointer one-past-the-end of the space you just malloc'd.
However you cannot iterate off the end of one malloc'd block and hope to hit another malloc'd block. To implement this idea you would have to malloc a large block and then place various struct layout objects throughout it, being careful about alignment.
In this approach, it's probably best to also store an index of where each struct layout is. In theory you could go through the list from the start each time, adding on size and then doing your alignment adjustment; but that would be slow and also it would mean you could not cope with a block in the middle being freed and re-"allocated".
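Here is a rough sketch of the "one large block" idea, walking from header to header. All the helper names (round_up, block_span, pool_put, pool_count) are hypothetical, the pool is assumed to come from malloc so that it is suitably aligned, and only the header's own alignment is handled (not alignment of data for arbitrary types).

```c
#include <stddef.h>
#include <stdint.h>

struct layout {
    uint32_t size;            /* payload bytes that follow the header */
    unsigned char data[];
};

/* Round n up to a multiple of the header's alignment. */
static size_t round_up(size_t n)
{
    size_t a = _Alignof(struct layout);
    return (n + a - 1) / a * a;
}

/* Distance in bytes from this header to the next one in the pool. */
static size_t block_span(const struct layout *b)
{
    return round_up(offsetof(struct layout, data) + b->size);
}

/* Append a block at byte offset *used; returns NULL if it does not fit. */
static struct layout *pool_put(unsigned char *pool, size_t cap,
                               size_t *used, uint32_t size)
{
    size_t span = round_up(offsetof(struct layout, data) + size);
    if (span > cap - *used)
        return NULL;
    struct layout *b = (struct layout *)(pool + *used);
    b->size = size;
    *used += span;
    return b;
}

/* Count the blocks by stepping over each header plus its payload. */
static size_t pool_count(unsigned char *pool, size_t used)
{
    size_t n = 0;
    for (size_t off = 0; off < used;
         off += block_span((struct layout *)(pool + off)))
        n++;
    return n;
}
```

The step is computed in bytes on an unsigned char pointer. Adding a byte count directly to a struct layout pointer, as in the question, would scale it by sizeof(struct layout).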
If this is meant to be a drop-in replacement for malloc then there are in fact two alignment considerations:
alignment for struct layout
data must be aligned for any possible type
The simplest way to cope with this is to align struct layout for any possible type also. This could look like (note: #include <stddef.h> is required for max_align_t and <stdint.h> for uint64_t):
struct layout
{
uint64_t size; // may as well use 64 bits since they're there
_Alignas(max_align_t) uchar_t data[];
};
An alternative might be to keep size at 32-bit and throw in a pragma pack to prevent padding; then you'll need to use some extra complexity to make sure that the struct layout is placed 4 bytes before a max_align_t-byte boundary, and so on. I'd suggest doing it the easy way first and get your code running; then later you can go back and try this change in order to save a few bytes of memory if you want.
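For the easy version, a sketch along these lines shows how the header and the user pointer relate. block_alloc and block_header are hypothetical names, each block is a separate malloc here for simplicity, and unsigned char is used in place of uchar_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct layout {
    uint64_t size;
    _Alignas(max_align_t) unsigned char data[];
};

/* Allocate n usable bytes; the returned pointer is the data member. */
static void *block_alloc(size_t n)
{
    struct layout *b = malloc(sizeof *b + n);
    if (b == NULL)
        return NULL;
    b->size = n;
    return b->data;
}

/* Step back from a user pointer to the header in front of it. */
static struct layout *block_header(void *p)
{
    return (struct layout *)((unsigned char *)p
                             - offsetof(struct layout, data));
}
```

Since malloc returns storage aligned for any type and data sits at a max_align_t multiple inside the struct, the pointer handed to the user is aligned for any type too.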
Alternative approaches:
Keep each instance of a struct layout plus its trailing data in a separate allocation.
Change data to be a pointer to malloc'd space; then you could keep all of the struct layout objects in an array.
The general idea will work, but that specific struct will only work if the most-severe boundary alignment case is an int.
A memory manager, particularly one that might be a back-end for an implementation of malloc(), must know what that worst-case boundary is. The actual start of data must be on that boundary in order to satisfy the general requirement that the allocated memory be suitably aligned for the storage of any data type.
The easiest way to get that done is to make the length allocation header described by the layout struct and the actual allocation sizes all multiples of that alignment unit.
No matter what, you can't describe the start of data as a struct member and have the size of that struct be the size of the header. C doesn't support zero-length fields. You should use something to put that array on the required boundary, and use the offsetof() macro from <stddef.h> to find where it starts.
Personally, I'd use a union, based on both old habits and occasional use of Visual C++ for C. But uint32_t is a C99 type and if you also have C11 support you can use _Alignas(). With that, your struct could look something like:
#define ALIGN_TYPE double /* if this is the worst-case type */
#define ALIGN_UNIT (sizeof(ALIGN_TYPE))
#define ALIGN_SIZE(n) (((size_t)(n) + ALIGN_UNIT - 1) & ~(ALIGN_UNIT-1))
typedef struct layout
{
size_t size; /* or use uint32_t if you prefer */
_Alignas(ALIGN_UNIT) char data[1];
} layout;
#define HEADER_SIZE (offsetof(layout, data))
That makes most everything symbolic except for the worst-case alignment type. You'd allocate the combined header plus data array with:
layout *ptr = (layout*) malloc(HEADER_SIZE + ALIGN_SIZE(number_of_bytes));
ptr->size = ALIGN_SIZE(number_of_bytes); /* record the usable data size, not the header size */
ALIGN_UNIT isn't a preprocessor constant, though: sizeof is evaluated by the compiler proper, not by the preprocessor, so you can't test it in an #if directive (it is an ordinary integer constant expression, so array dimensions are fine). You can hard-code a literal number, like 8 for a typical double, if that's a problem. Also note that the mask in ALIGN_SIZE only works when ALIGN_UNIT is a power of two. Beware that long double has a problematic size on many x86 implementations (10 bytes of data, usually padded out to 12 or 16). If you're going to base the allocation unit on a type, then long double might not be your best choice.
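The rounding macro is worth checking with a few numbers. This sketch hard-codes the unit to 8 (an assumption, matching a typical double), and adds a division-based variant, round_up_any, which is my own addition for units that are not powers of two.

```c
#include <stddef.h>

#define ALIGN_UNIT ((size_t)8)   /* assumed: size of a typical double */
#define ALIGN_SIZE(n) (((size_t)(n) + ALIGN_UNIT - 1) & ~(ALIGN_UNIT - 1))

/* The mask trick is only valid for powers of two. */
_Static_assert((ALIGN_UNIT & (ALIGN_UNIT - 1)) == 0,
               "ALIGN_UNIT must be a power of two");

/* Function form of the macro, convenient for debugging. */
static size_t align_size(size_t n)
{
    return ALIGN_SIZE(n);
}

/* Hypothetical fallback that works for any non-zero unit. */
static size_t round_up_any(size_t n, size_t unit)
{
    return (n + unit - 1) / unit * unit;
}
```

With a unit of 8, sizes 1 through 8 round to 8 and 9 through 16 round to 16. With a unit of 10 the mask version would give wrong answers, while the division version rounds 13 up to 20.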

C Are arrays in structs aligned?

struct {
uint8_t foo;
uint8_t bar;
uint8_t baz;
uint8_t foos[252];
uint8_t somethingOrOther;
} A;
struct {
uint8_t foo;
uint8_t bar;
uint8_t baz;
uint8_t somethingOrOther;
uint8_t foos[252];
} B;
Does it matter that I've put foos on byte 3 in the first example, vs on byte 4 in B?
Does an array in C have to start aligned?
Is the size of this struct 256 bytes exactly?
Given that the data type is uint8_t (equivalent to unsigned char), there is no need for padding in the structure, regardless of how it is ordered. So, you can reasonably assume in this case that every compiler will make that structure into 256 bytes, regardless of the order of the elements.
If there were data elements of different sizes, then you might well get padding added and the size of the structure might vary depending on the order of the elements.
As the good book says (C11 section 6.7.2.1 paragraph 14):
Each non-bit-field member of a structure or union object is aligned in
an implementation-defined manner appropriate to its type... There
may be unnamed padding within a structure object
You didn't "put foos on byte 3" - apart from the fact that the first element is always on byte 0, you have no real control over what byte an element will be put on. The compiler can give each field a full machine word or more if it thinks that will provide the most efficient accesses. If so, it will allocate "padding" bytes in between - unused space.
Note that this only applies to structs; arrays themselves do not add any padding bytes (if they did, the pointer arithmetic/indexing rule wouldn't work), so you can be sure that the size of foos itself is exactly the declared size, and know the exact alignment of each numbered element.
No, it doesn't matter.
Pretty much every compiler out there will align struct fields to a natural boundary appropriate for the target architecture, by inserting hidden "padding bytes" between fields where necessary.
Specifically, since you're using only uint8_t, there will be no padding bytes inserted - every field already falls on a naturally-aligned boundary. (Everything is a multiple of one.)
Both of the structs you've shown are exactly 256 bytes in size. You can confirm this like so:
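#include <stdio.h>
/* A and B are the objects declared in the question; the structs
   themselves have no tag, so take sizeof the objects, not "struct A" */
int main(void)
{
    printf("sizeof(A)=%zu\n", sizeof(A));
    printf("sizeof(B)=%zu\n", sizeof(B));
    return 0;
}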
You can prevent this padding from being added by "packing" the structure:
Microsoft Visual C uses #pragma pack(1)
GCC uses __attribute__((packed))
Do note that doing this on a structure with unaligned members can have a serious impact on the performance of your program.
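To see what packing actually changes, compare a struct with mixed member sizes in both forms. The struct names here are hypothetical, and the packed variant uses the GCC attribute syntax, so it assumes GCC or a compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may pad after tag so value is aligned. */
struct natural {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout: no padding, so value starts at offset 1. */
struct __attribute__((packed)) squeezed {
    uint8_t  tag;
    uint32_t value;
};

/* Reading through the struct is safe: the compiler knows the member
   may be unaligned and generates a suitable access sequence. */
static uint32_t squeezed_value(const struct squeezed *s)
{
    return s->value;
}
```

The packed struct is 5 bytes. The unpacked one is typically 8, but that is up to the implementation. The cost is that value may need a slower unaligned load on some targets.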

structure with bitfields, memory allocation for array

I want to use a structure as given below on my ARM Cortex M0. I am using C programming.
struct path_table_t {
uint8_t lut_index;
uint8_t flag : 1;
};
The flag field is restricted to a single bit using a bit-field. How will memory be allocated for an array of this structure?
Will the bit-field save any memory overall?
Most likely the sizeof(struct path_table_t) == 2. That is, structure with 2 elements, with each being 8 bits. 7 bits of flag should remain unused, but are still reserved.
Note that C standard gives a lot of room for compiler implementation to vary the size required by the bit fields. For example (N1570, 6.7.2.1 p11):
An implementation may allocate any addressable storage unit large enough to hold a bit-field.
and
The alignment of the addressable storage unit is unspecified.
So the space being reserved by the flag and its padding might be even larger.
If you wish to be sure, check the size with sizeof. Just remember that changing compiler or compiler settings might affect the result.
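One way to make that check stick is to put it in the build. The sketch below assumes the 2-byte layout described above (which is what GCC produces for this struct on common targets, though the standard does not guarantee it) and uses a hypothetical helper, path_table_bytes, for the array footprint.

```c
#include <stddef.h>
#include <stdint.h>

struct path_table_t {
    uint8_t lut_index;
    uint8_t flag : 1;
};

/* Compilation fails if another compiler or setting lays this out
   differently from the assumed 2 bytes. */
_Static_assert(sizeof(struct path_table_t) == 2,
               "path_table_t is expected to occupy 2 bytes");

/* Total bytes used by an array of n entries. */
static size_t path_table_bytes(size_t n)
{
    return n * sizeof(struct path_table_t);
}
```

Array elements are laid out back to back, so the per-element size from sizeof is all you need to work out the cost of the table.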
To answer your question Will I get the benefit of bit fields as saving in total memory? NO
consider two struct's
struct path_table1_t {
uint8_t lut_index;
uint8_t flag : 1;
}a;
and
struct path_table2_t {
uint8_t lut_index;
uint8_t flag;
}b;
sizeof(a) and sizeof(b) are equal.
But if you plan to pack several bit-fields into the same storage unit, then the memory allocated is reduced. For example:
struct path_table3_t {
uint8_t lut_index : 1;
uint8_t flag : 1;
}c;
In this case the sizeof(c) is the sizeof(uint8_t) which will be 1 byte.
To summarize:
sizeof(c) < sizeof(a) and sizeof(a) = sizeof(b)
If saving memory is very important to you, look at GCC's __attribute__((__packed__)), which tells the compiler to remove padding so the struct takes no more memory than its members need.
Memory allocation for bit fields is implementation defined.
Let's consider your example:
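#include <stdint.h>
#include <stdio.h>
struct path_table_t {
uint8_t lut_index;
uint8_t flag : 1;
};
int main(void)
{
struct path_table_t p;
printf("%zu\n", sizeof(p)); /* sizeof yields a size_t, so print it with %zu */
return 0;
}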
Here you need 9 bits in total, so the struct is padded out to 2 bytes.
If your structure was like
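#include <stdint.h>
#include <stdio.h>
struct path_table_t {
uint8_t lut_index : 1;
uint8_t flag : 1;
};
int main(void)
{
struct path_table_t p;
printf("%zu\n", sizeof(p)); /* sizeof yields a size_t, so print it with %zu */
return 0;
}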
Here only 2 bits are required, so sizeof(struct path_table_t) would typically be 1 byte, which answers your question.
So yes, bit-fields can save memory when several of them share a storage unit, as shown above.

Resources