How to fix allocation size of a struct

How to fix allocation size of a struct - c

I have a struct
typedef struct
{
int A ;
int B ;
…
} SomeStruct ;
I have an instance of SomeStruct that I want to persist to Flash memory which has a sector size of 512 bytes. What GCC attribute can I apply to that instance such that the allocation size is a multiple of 512 bytes?
The only options I can think of are :-
1) Pad the struct to make it exactly 512 bytes. This requires recalculation every time a field is added to the struct. No compiler warnings when I get it wrong. Also struct is larger than needed with normal initialisation, copying etc.
2) Place the variable in a separate Linker section. This provides full protection and warnings, but gets a bit tedious if multiple variables are used.
3) Make a union of the struct and a 512 byte array. Copes with adding extra fields until the struct is greater than 512 bytes, then fails without any warnings.

Referring 1:
#include <assert.h>
#define FLASH_BYTES (512)
#pragma pack(1)
struct flash
{
struct data
{
int i;
char c;
...
};
char pads[FLASH_BYTES - sizeof (struct data)];
};
#pragma pack()
int main(void)
{
assert(sizeof (struct flash) == FLASH_BYTES);
...
The assert might even not be necessary because if the result
FLASH_BYTES - sizeof (struct data)
is negative any GCC should issue an error. To make sure it will be negative cast the result of the sizeof operation to any signed integer, like for example so:
FLASH_BYTES - (int) sizeof (struct data)
So trying to compile this
#pragma pack(1)
struct flash
{
struct data
{
int i;
char c[FLASH_BYTES];
};
char pads[FLASH_BYTES - (int) (sizeof (struct data))];
};
#pragma pack()
int main(void)
{
}
You should be giving you something like:
main.c:14:12: error: size of array ‘pads’ is negative
char pads[FLASH_BYTES - (int) sizeof (struct data)];

A portable solution is to define a union of SomeStruct, with a char array whose size is calculated to meet the necessary alignment.
typedef struct
{
int A;
int B;
char c[512];
} SomeStruct;
#define STORAGE_ALIGNMENT 512
typedef union
{
SomeStruct data;
char paddedData[((sizeof(SomeStruct) + STORAGE_ALIGNMENT - 1) / STORAGE_ALIGNMENT) * STORAGE_ALIGNMENT];
} SomeStructAligned;
Online running version (Coliru) here
The sizing formula is well known and works for any integer. Since this is a power-of-2 you could also simplify it down to the form (sizeof(SomeStruct) + (STORAGE_ALIGNMENT - 1)) & ~(STORAGE_ALIGNMENT - 1)) == (sizeof(SomeStruct) + 0x1ff) & ~0x1ff). In practice you may need ~size_t(0x1ff) on the rightmost term to ensure portability to 64-bit machines; since 0x1ff is an int (32-bit), ~0x1ff results in a 64-bit 0x00000000fffffe00 value instead of the desired 0xFFFFFFFFfffffe00 mask.
Sub-Optimal Approach
An alternative approach could have been to define a wrapper struct containing your original data plus some automatically calculated padding.
typedef struct
{
int A;
int B;
} SomeStruct;
#define STORAGE_ALIGNMENT 512
typedef struct
{
SomeStruct data;
char padding[(STORAGE_ALIGNMENT) - (sizeof(SomeStruct) % STORAGE_ALIGNMENT)];
} SomeStructAligned;
Online running version (Coliru) here.
However, the above is not perfect: if sizeof(SomeStruct) is a multiple of 512, then sizeof(padding) will be 512, wasting a quantum of storage. Whereas the union never wastes space.

You can try something like this (though it is a dirty bit trick)
#define ROUND_UP_512(x) ((x) + 511 & ~511)
struct myStruct {
// put whatever
};
union myUnion{
myStruct s;
char ensureSize[ROUND_UP_512(sizeof(myStruct))];
};
in this case the size of "myUnion" is guaranteed to be a multiple of 512 that is greater than or equal to the size of "myStruct"

Related

Pad a packed struct in C for 32-bit alignment

I have a struct defined that is used for messages sent across two different interfaces. One of them requires 32-bit alignment, but I need to minimize the space they take. Essentially I'm trying to byte-pack the structs, i.e. #pragma pack(1) but ensure that the resulting struct is a multiple of 32-bits long. I'm using a gcc arm cross-compiler for a 32-bit M3 processor. What I think I want to do is something like this:
#pragma pack(1)
typedef struct my_type_t
{
uint32_t someVal;
uint8_t anotherVal;
uint8_t reserved[<??>];
}
#pragma pack()
where <??> ensures that the size of my_type_t is divisible by 4 bytes, but without hard-coding the padding size. I can do something like this:
#pragma pack(1)
typedef struct wrapper_t
{
my_type_t m;
uint8_t reserved[sizeof(my_type_t) + 4 - (sizeof(my_type_t) % 4)]
}
#pragma pack()
but I'd like to avoid that.
Ultimately what I need to do is copy this to a buffer that is 32-bit addressable, like:
static my_type_t t; //If it makes a difference, this will be declared statically in my C source file
...
memcpy(bufferPtr, (uint32_t*)&t, sizeof(t)) //or however I should do this
I've looked at the __attribute__((align(N))) attribute, which gives me the 32-bit aligned memory address for the struct, but it does not byte-pack it. I am confused about how (or if) this can be combined with pack(1).
My question is this:
What is the right way to declare these structs so that I can minimize their footprint in memory but that allows me to copy/set it in 4-byte increments with a unsigned 32-bit pointer? (There are a bunch of these types of arbitrary size and content). If my approach above of combining pack and padding is going about this totally wrong, I'll happily take alternatives.
Edit:
Some constraints: I do not have control over one of the interfaces. It is expecting byte-packed frames. The other side is 32-bit addressable memory mapped registers. I have 64k of memory for the entire executable, and I'm limited on the libraries etc. I can bring in. There is already a great deal of space optimization I've had to do.
The struct in this question was just to explain my question. I have numerous messages of varying content that this applies to.

I can't speak for the specific compiler and architecture you are using, but I would expect the following to be sufficient:
typedef struct {
uint32_t x;
uint8_t y;
} my_type_t;
The structure normally has the same alignment as its largest field, and that includes adding the necessary padding at the end.
my_type_t
+---------------+
| x |
+---+-----------+
| y | [padding] |
+---+-----------+
|<-- 32 bits -->|
Demo
This is done so the fields are properly aligned when you have an array of them.
my_type_t my_array[2];
my_array[1].x = 123; // Needs to be properly aligned.
The above assumes you have control over the order of the fields to get the best space efficiency, because it relies on the compiler aligning the individual fields. But those assumptions can be removed using GCC attributes.
typedef struct {
uint8_t x;
uint32_t y;
uint8_t z;
}
__attribute__((packed)) // Remove interfield padding.
__attribute__((aligned(4))) // Set alignment and add tail padding.
my_type_t;
This produces this:
my_type_t
+---+-----------+
| x | y
+---+---+-------+
| z | [pad] |
+---+---+-------+
|<-- 32 bits -->|
Demo
The packed attribute prevents padding from being added between fields, but aligning the structure to a 32-bit boundary forces the alignment you desire. This has the side effect of adding trailing padding so you can safely have an array of these structures.

As you use gcc you need to use one of the attributes.
Example + demo.
#define PACKED __attribute__((packed))
#define ALIGN(n) __attribute__((aligned(n)))
typedef struct
{
uint8_t anotherVal;
uint32_t someVal;
}PACKED my_type_t;
my_type_t t = {1, 5};
ALIGN(64) my_type_t t1 = {1, 5};
ALIGN(512) my_type_t t2 = {2, 6};
int main()
{
printf("%p, %p, %p", (void *)&t, (void *)&t1, (void *)&t2);
}
Result:
0x404400, 0x404440, 0x404600
https://godbolt.org/z/j9YjqzEYW

I suggest combining #pragma pack with alignas:
#include <stdalign.h>
#include <stdint.h>
typedef struct {
#pragma pack(1)
alignas(4) struct { // requires 2+1+2 bytes but is aligned to even 4:s
uint16_t someVal; // +0
uint8_t anotherVal; // +2
uint16_t foo; // +3 (would be 4 without packing)
};
#pragma pack()
} my_type_t;
The anonymous inside struct makes access easy as before:
int main() {
my_type_t y;
y.someVal = 10;
y.anotherVal = 'a';
y.foo = 20;
printf("%zu\n", (char*)&y.someVal - (char*)&y.someVal); // 0
printf("%zu\n", (char*)&y.anotherVal - (char*)&y.someVal); // 2
printf("%zu\n", (char*)&y.foo - (char*)&y.someVal); // 3
my_type_t x[2];
printf("%zu\n", (char*)&x[1] - (char*)&x[0]); // 8 bytes diff
}
If you'd like to be able to take the sizeof the actual data carrying part of my_type_t (to send it), you could name the inner struct (which makes accessing the fields a little more cumbersome):
#pragma pack(1)
typedef struct {
uint16_t someVal;
uint8_t anotherVal;
uint16_t foo;
} inner;
#pragma pack()
typedef struct {
alignas(4) inner i;
} my_type_t;
You'd now have to mention i to access the fields, but it has the benefit that you can take sizeof and get 5 (in this example):
int main() {
my_type_t y;
printf("%zu %zu\n", sizeof y, alignof(y)); // 8 4
printf("%zu\n", sizeof y.i); // 5 (the actual data)
}

To form a structure type that is aligned one must put the alignment attribute to the first member of the struct. It can be combined with the packed attribute.
typedef struct {
_Alignas(4) uint8_t anotherVal;
uint32_t someVal;
} __attribute__((packed)) my_type_t;
Exemplary usage with alignment exaggerated to 64 bytes.
#include <stdint.h>
#include <stdio.h>
typedef struct {
_Alignas(64) uint8_t anotherVal;
uint32_t someVal;
} __attribute__((packed)) my_type_t;
int main() {
my_type_t a, b;
printf("%zu %p\n", sizeof a, (void*)&a);
printf("%zu %p\n", sizeof b, (void*)&b);
}
prints:
64 0x7ffff26caf80
64 0x7ffff26cafc0

C dummy struct, strict aliasing and static initialization

My first question wasn't well formulated so here goes again, this time, more well asked and explained.
I want to hide the variables of a struct while being able to initialize the struct statically on the stack. Most solutions out there use the opaque pointer idiom and dynamic memory allocation which isn't always desired.
The idea for this example came from the following post:
https://www.reddit.com/r/C_Programming/comments/aimgei/opaque_types_and_static_allocation/
I know that this is probably ub but I believe it should work fine in most consumers archictures: either 32 bit or 64 bit.
Now you may tell me that sometimes size_t may be bigger than void * and that the void * alignment in the union forcing the union alignment to be that of sizeof(void *) may be wrong, but usually that's never case, maybe it can happen but I see it as the exception not the rule.
Based on the fact that most compilers add padding to align it to either a multiple of 4 or 8 depending on your architecture and that sizeof returns the correct size with padding, sizeof(Vector) and sizeof(RealVector) should be the same, and based on the fact that both Vector and RealVector have the same alignment it should be fine too.
If this is ub, how can I create a sort of scratchpad structure in C in a safe maner? In C++ we have alignas, alignof and placement new which hepls making this ordeal a lot more safer.
If that's not possible to do in C99, will it be more safer in C11 with alignas and alignof?
#include <stdint.h>
#include <stdio.h>
/* In .h */
typedef union Vector {
uint8_t data[sizeof(void *) + 2 * sizeof(size_t)];
/* this is here to the force the alignment of the union to that of sizeof(void *) */
void * alignment;
} Vector;
void vector_initialize_version_a(Vector *);
void vector_initialize_version_b(Vector *);
void vector_debug(Vector const *);
/* In .c */
typedef struct RealVector {
uint64_t * data;
size_t length;
size_t capacity;
} RealVector;
void
vector_initialize_version_a(Vector * const t) {
RealVector * const v = (RealVector *)t;
v->data = NULL;
v->length = 0;
v->capacity = 8;
}
void
vector_initialize_version_b(Vector * const t) {
*(RealVector *)t = (RealVector) {
.data = NULL,
.length = 0,
.capacity = 16,
};
}
void
vector_debug(Vector const * const t) {
RealVector * v = (RealVector *)t;
printf("Length: %zu\n", v->length);
printf("Capacity: %zu\n", v->capacity);
}
/* In main.c */
int
main() {
/*
Compiled with:
clang -std=c99 -O3 -Wall -Werror -Wextra -Wpedantic test.c -o main.exe
*/
printf("%zu == %zu\n", sizeof(Vector), sizeof(RealVector));
Vector vector;
vector_initialize_version_a(&vector);
vector_debug(&vector);
vector_initialize_version_b(&vector);
vector_debug(&vector);
return 0;
}

I'll post my answer from the previous question, which I didn't have to time to post :)
Am I safe doing this?
No, you are not. But instead of finding a way of doing it safe, just error when it's not safe:
#include <assert.h>
#include <stdalign.h>
static_assert(sizeof(Vector) == sizeof(RealVector), "");
static_assert(alignof(Vector) == alignof(RealVector), "");
With checks written in that way, you will know beforehand when there's going to be a problem, and you can then fix it handling the specific environment. And if the checks will not fire, you will know it's fine.
how can I create a sort of scratchpad structure in C in a safe maner?
The only correct way of really doing it safe would be a two step process:
first compile a test executable that would output the size and alignment of struct RealVector
then generate the header file with proper structure definition struct Vector { alignas(REAL_VECTOR_ALIGNMENT) unigned char data[REAL_VECTOR_SIZE]; };
and then continue to compiling the final executable
Compilation of test and final executables has to be done using the same compiler options, version and settings and environment.
Notes:
Instead of union use struct with alignof
uint8_t is an integer with 8-bits. Use char, or best unsigned char, to represent "byte".
sizeof(void*) is not guaranteed to be sizeof(uint64_t*)
where max alignment is either 4 or 8 - typically on x86_64 alignof(long double) is 16.

Why nor simple? It avoids the pointer punning
typedef struct RealVector {
uint64_t * data;
size_t length;
size_t capacity;
} RealVector;
typedef struct Vector {
uint8_t data[sizeof(RealVector)];
} Vector;
typedef union
{
Vector v;
RealVector rv;
} RealVector_union;
void vector_initialize_version_a(void * const t) {
RealVector_union * const v = t;
v -> rv.data = NULL;
v -> rv.length = 0;
v -> rv.capacity = 8;
}
And

One possibility is to define Vector as follows in the .h file:
/* In vector.h file */
struct RealVector {
uint64_t * data;
size_t length;
size_t capacity;
};
typedef union Vector {
char data[sizeof(struct RealVector)];
/* these are here to the force the alignment of the union */
uint64_t * alignment1_;
size_t alignment2_;
} Vector;
That also defines struct RealVector for use in the vector implementation .c file:
/* In vector.c file */
typedef struct RealVector RealVector;
This has the advantage that the binary contents of Vector actually consists of a RealVector and is correctly aligned. The disadvantage is that a sneaky user could easily manipulate the contents of a Vector via pointer type casting.
A not so legitimate alternative is to remove struct RealVector from the .h file and replace it with an anonymous struct type of the same shape:
/* In vector.h file */
typedef union Vector {
char data[sizeof(struct { uint64_t * a; size_t b; size_t c; })];
/* these are here to the force the alignment of the union */
uint64_t * alignment1_;
size_t alignment2_;
} Vector;
Then struct RealVector needs to be fully defined in the vector implementation .c file:
/* In vector.c file */
typedef struct RealVector {
uint64_t * data;
size_t length;
size_t capacity;
} RealVector;
This has the advantage that a sneaky user cannot easily manipulate the contents of a Vector without first defining another struct type of the same shape as the anonymous struct type. The disadvantage is that the anonymous struct type that forms the binary representation of Vector is not technically compatible with the RealVector type used in the vector implementation .c file because the tags and member names are different.

Why can't we access bits that we pad in structures?

My question is we do padding to align structures.
typedef struct structb_tag
{
char c;
int i;
} structb_t;
Here we use 8 bytes. Why can't we use the 3 bytes that lot?

Why cant we use the 3 bytes
You could.
To do so measure the size your implementation allocates for a struct and then make it a union adding a char-array of exactly the size measured and there you go.
Assuming this
typedef struct structb_tag
{
char c;
int i;
} structb_t;
is created using eight bytes, that is sizeof (structb_t) evaluates to 8, change it to the following
typedef union unionb_tag
{
char bytes[8];
struct
{
char c;
int i;
} structb;
}
More reliable, in terms of portability and also robustness, would be this:
typedef union unionb_tag
{
struct structb_tag
{
char c;
int i;
} structb;
char bytes[sizeof (struct structb_tag)];
}

If you are using GCC and space is the most important thing for you instead of speed (which padding provides) you could just request the compiler to not do the padding, struct __attribute__((__packed__)) mystruct, padding is the way for compiler to align structures in the natural for faster access.

You can always take the pointer to the structure convert it to a byte pointer and access any byte of that structure.This is dangerous way, though.

The padding are implementation dependent, which are not defined by the standard, you cannot not use them as there is no way to reference the padding bytes.

Yes, you can.
typedef struct structb_tag
{
char c;
char pad[3];
int i;
} structb_t;
structb_t test;
test.pad[0] = 'a';

In short, we can use that three bytes.
The reason why we need 3 bytes padding is for memory usage optimization, so the compiler will help us add an gap between c and i. So when you use
typedef struct structb_tag
{
char c;
int i;
} structb_t;
It actually
typedef struct structb_tag
{
char c;
char[3] unseen_members;
int i;
} structb_t;
Accessing these unseen members will not cause any segmentation fault. In the point of view of OS, there's no difference between accessing members declared by programmers explicitly and declared by compiler implicitly.
#include <stdio.h>
#include <string.h>
typedef struct Test {
char c;
int i;
} MyTest;
int main() {
MyTest my_test;
memset(&my_test, 0, sizeof(my_test));
my_test.c = 1;
int* int_ptr = (int *)&my_test.c;
printf("Size of my_test is %lu\n", sizeof(my_test));
printf("Value of my_test.c(char) is %d\n", my_test.c);
printf("Value of my_test.c(int) is %d\n", *int_ptr);
return 0;
}
This gives:
Size of my_test is 8
Value of my_test.c(char) is 1
Value of my_test.c(int) is 1

We can access any byte in the structure using a pointer to that structure and typecasting that pointer to (char *) if we accordingly increment that pointer then we can access any byte but this not good programming skill. Structures are padded with some extra bytes so that an execution of the program can become faster.

Generate padding bytes for a structure by nesting it in a union

I was going through this question to reaffirm my understanding of structure padding.I have a doubt now. When I do something like this:
#include <stdio.h>
#define ALIGNTHIS 16 //16,Not necessarily
union myunion
{
struct mystruct
{
char a;
int b;
} myst;
char DS4Alignment[ALIGNTHIS];
};
//Main Routine
int main(void)
{
union myunion WannaPad;
printf("Union's size: %d\n\
Struct's size: %d\n", sizeof(WannaPad),
sizeof(WannaPad.myst));
return 0;
}
Output:
Union's size: 16
Struct's size: 8
should I not expect the struct to have been padded by 8 bytes? If I explicitly pad eight bytes to the structure, the whole purpose of nesting it inside an union like this is nullified.
I feel that declaring a union containing a struct and a character array the size of the struct ought to be but isn't, makes way for a neater code.
Is there a work-around for making it work as I would like it?

Think about it logically.
Imagine I had a union with some basic types in it:
union my_union{
int i;
long l;
double d;
float f;
};
would you expect sizeof(int) == sizeof(double)?
The inner types will always be their size, but the union will always be large enough to hold any of its inner types.

should I not expect the struct to have been padded by 8 bytes?
No, as the struct mystruct is seen/handled on its own. The char had been padded by 3 sizeof (int) -1 bytes to let the int be properly aligned. This does not change, even if someone, somewhere, sometimes decides to use this very struct mystruct inside another type.

By default struct is padded, so int b field is aligned on sizeof(int) boundary. There are several workarounds for this:
explicitly use fillers where needed: char a; char _a[sizeof(int)-1]; int b;
use compiler-dependent pragma to pack struct on byte boundary
use command-line switch etc.

Cast a struct of ints to an array of ints

I am using a library that has a function that takes an array of structs. That struct and function has the following layout:
struct TwoInt32s
{
int32_t a;
int32_t b;
};
void write(struct TwoInt32s *buffer, int len);
My initial tests suggest that an array of such structs has the same memory layout as an array of int32_t so I can do something like this:
int32_t *buffer = malloc(2 * len * sizeof(int32_t));
/* fill in the buffer */
write((struct TwoInt32s*)buffer, len);
However I'm wondering if this is universally true or not. Using an array of int32_t greatly simplifies my code.
EDIT: I forgot the sizeof
From what I read, C guarantees a few things about struct padding:
members will NOT be reordered
padding will only be added between members with different alignments or at the end of the struct
a pointer to a struct points to the same memory location as a pointer to its first member
each member is aligned in a manner appropriate for its type
there may be unnamed holes in the struct as necessary to achieve alignment
From this I can extrapolate that a and b have no padding between them. However it's possible that the struct will have padding at the end. I doubt this since it's word-aligned on both 32 and 64 bit systems. Does anyone have additional information on this?

The implementation is free to pad structs - there may be unused bytes in between a and b. It is guaranteed that the first member isn't offset from the beginning of the struct though.
Typically you manage such layout with a compiler-specific pragma, e.g:
#pragma pack(push)
#pragma pack(1)
struct TwoInt32s
{
int32_t a;
int32_t b;
};
#pragma pack(pop)

malloc allocates bytes. Why did you choose "2*len" ?
You could simply use "sizeof":
int32_t *buffer = malloc(len * sizeof(TwoInt32s));
/* fill in the buffer */
write((struct TwoInt32s*)buffer, len);
and as Erik mentioned, it would be a good practice to pack the struct.

It's safest to not cast, but convert -- i.e., create a new array and fill it with the values found in the struct, then kill the struct.

You could allocate structures but treat their members as a sort of virtual array:
struct TwoInt32s *buffer = malloc(len * sizeof *buffer);
#define BUFFER(i) (*((i)%2 ? &buffer[(i)/2].b : &buffer[(i)/2].a))
/* fill in the buffer, e.g. */
for (int i = 0; i < len * 2; i++)
BUFFER(i) = i;
Unfortunately, neither GCC nor Clang currently "get" this code.