I am seeing confusing behaviour with the memory alignment of structure members. Consider these two structures:
typedef struct s_inner {
unsigned long ul1;
double dbl1;
fourth_struct s4;
unsigned long ul2;
int i1;
} t_inner;
typedef struct s_outer {
other_struct member1; /* 4-byte aligned, 40 bytes in terms of sizeof() */
unsigned long member2;
t_inner member3; /* see above */
} t_outer;
When I inspect the memory layout of t_outer, I can see that the elements of member1 are 4-byte aligned, just as I would expect.
Also the memory layout of member3 is as expected: ul1 has 4 padding bytes attached so that dbl1 is aligned on an 8-byte border (normal on Win32).
However, when I inspect the memory layout of member2, I can see that this member has 4 padding bytes attached to it.
Could anyone explain why on earth member2 receives padding bytes? My expectation was that member2 would not carry any padding.
Edit 1:
See this memory dump. Before filling the structure elements, I've memset'd the whole t_outer structure with p's:
the red area is member1
the blue area is member2
the green area is member3
the yellow area marks the location of dbl1 within member3
Constraints
Compiler is VS2012
the actual structure of other_struct should not matter here, it's a 40-byte sized 4-byte aligned structure
I do not want any workarounds for the behavior (reordering, packing, ...) but an explanation why this is happening.
so that dbl1 is aligned on an 8-byte border
Sure. But that alignment guarantee means bupkis if the structure itself is not aligned to 8 as well. Which is a guarantee that's normally provided by the compiler's address choices for the data section and the stack frame. Or the memory allocator. All guarantee at least alignment to 8.
But when you embed the structure inside s_outer then those 4 bytes of padding before member3 (not after member2) are required to get the alignment guarantee back.
Also note that a structure can have padding after the last member. Which may be required to ensure that members are still aligned when the structure is stored in an array. Same reason.
The VS2012 docs describe its padding behavior. In particular, they specify that the alignment requirement for a struct is the largest alignment requirement of any of its members. Member member3's type t_inner has a member of type double, with an 8-byte alignment requirement, therefore member3 overall has an 8-byte alignment requirement (and, also, t_outer has an 8-byte alignment requirement). Padding is required between member2 and member3 to obtain 8-byte alignment of member3.
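To see this for yourself, offsetof makes the padding visible. What follows is only a sketch under stated assumptions: the question does not show other_struct or fourth_struct, so other_struct is modelled as ten 32-bit words (40 bytes, 4-byte aligned), fourth_struct is left out, and unsigned long is replaced by uint32_t to mimic Win32 on any platform. The helper names gap_before_member3 and print_outer_layout are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in (assumption): 40 bytes, 4-byte aligned, like the question's other_struct. */
typedef struct { uint32_t words[10]; } other_struct;

typedef struct s_inner {
    uint32_t ul1;   /* unsigned long is 4 bytes on Win32 */
    double   dbl1;  /* raises the alignment of t_inner to that of double */
    uint32_t ul2;
    int      i1;
} t_inner;

typedef struct s_outer {
    other_struct member1;
    uint32_t     member2;
    t_inner      member3;
} t_outer;

static_assert(sizeof(other_struct) == 40, "stand-in must be 40 bytes");

/* Padding bytes between the end of member2 and the start of member3. */
size_t gap_before_member3(void)
{
    return offsetof(t_outer, member3)
         - (offsetof(t_outer, member2) + sizeof(uint32_t));
}

/* Prints the offsets; call it from main(). */
void print_outer_layout(void)
{
    printf("member2 at %zu, member3 at %zu, gap %zu, alignof(t_inner) %zu\n",
           offsetof(t_outer, member2), offsetof(t_outer, member3),
           gap_before_member3(), (size_t)_Alignof(t_inner));
}
```

Where double is 8-byte aligned (Win32, x86-64), member2 ends at offset 44 and member3 has to start at 48, so the gap is 4 bytes. Those bytes belong to t_outer, not to member2.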
Related
If there is a C structure like
struct abc
{
uint16_t port_no; //2 Bytes
uint8_t src_mac[6]; //6 Bytes
};
How would the compiler apply padding to align into 4 byte word on 32 bit sys:
Will it be
2 Bytes
Pad[2]
6 Bytes
Pad[2]
or
2 Byte
6 Byte
That depends on the architecture of the system.
On a 32-bit MIPS your structure would have no
padding whatsoever because uint16_t must be aligned
on even addresses and uint8_t need not be aligned at all.
However a variable of type struct abc would be aligned at
even addresses, thereby possibly forcing padding outside the
struct (say, when struct abc is a member of a surrounding struct).
In C/C++ every type has a size and an alignment that can differ from architecture to architecture, kernel or even just C/C++ ABI.
GCC provides the __alignof__ operator, and C++11 officially has alignof(type-id), which can be used to print the alignment requirement of a type. C11 added the equivalent _Alignof (spelled alignof after including <stdalign.h>); older C has no standard operator for this, but each type still has that property.
In memory every type has to be aligned to at least the alignment requirement and the compiler will insert padding into structs that ensure that is the case. The struct as a whole then has the largest alignment required for any of its members and a size that includes padding to a multiple of the alignment of the struct.
Generally a uint16_t has a 2-byte alignment requirement and chars have a 1-byte alignment requirement. So the struct has a 2-byte alignment requirement and no padding is needed at all, giving a total size of 8 bytes for the struct.
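If you want the compiler to confirm that reasoning, C11 static assertions will do it. This is a small sketch that assumes uint16_t has 2-byte alignment, as it does on the mainstream ABIs I know of; if your target disagrees, compilation fails with the message shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct abc {
    uint16_t port_no;     /* 2 bytes, offset 0 */
    uint8_t  src_mac[6];  /* 6 bytes, offset 2 */
};

/* Checked at compile time; nothing here runs. */
static_assert(offsetof(struct abc, src_mac) == 2, "no padding after port_no");
static_assert(sizeof(struct abc) == 8, "no trailing padding");
static_assert(_Alignof(struct abc) == _Alignof(uint16_t),
              "struct alignment follows its strictest member");
```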
But beware that the calling conventions for an architecture can require more alignment or padding for one reason or another.
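Padding is on by default: the compiler inserts "gaps" into your structure automatically wherever a member's alignment requires them. If every member had to start on a 4-byte boundary, it would come out as though there were extra intervening variables, like this (most compilers align uint8_t to 1 byte, though, so they add no padding to this particular struct):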
struct abc
{
uint16_t port_no; // 2 Bytes
char pad_0[2]; // 2 Bytes
uint8_t src_mac[6]; // 6 Bytes
char pad_1[2]; // 2 Bytes
};
I'm working with vectors and matrices right now and it was suggested to me that I should use SSE instead of using float arrays. However, while reading the definitions of the C intrinsics and the assembly instructions, it looks like some of the functions come in two versions: one where the vector has to be "16 byte aligned" and a slower one where the vector isn't aligned. What does having the vector be 16 byte aligned mean? How can I ensure that my vectors are 16 byte aligned?
Alignment ensures that objects are aligned on an address that is a multiple of some power of two. 16-byte-aligned means that the numeric value of the address is a multiple of 16. Alignment is important because CPUs are often less efficient or downright incapable of loading memory that doesn't have the required alignment.
Your ABI determines the natural alignment of types. In general, integer types and floating-point types are aligned to either their own size, or the size of the largest object of that kind that your CPU can treat at once, whichever is smaller. For instance, on 64-bit Intel machines, 32-bit integers are aligned on 4 bytes and 64-bit integers are aligned on 8 bytes, whereas on 32-bit x86 Linux, 64-bit integers inside a struct are aligned on only 4 bytes.
The alignment of structures and unions is the same as their most aligned field. This means that if your struct contains a field that has a 2-byte alignment and another field that has an 8-byte alignment, the structure will be aligned to 8 bytes.
In C++, you can use the alignof operator, just like the sizeof operator, to get the alignment of a type. In C, the same construct becomes available when you include <stdalign.h>; alternatively, you can use _Alignof without including anything.
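As a quick illustration of the last two paragraphs, here is a sketch that prints a few alignments with alignof and checks that a struct takes the alignment of its most aligned field. The struct mixed and the function print_alignments are invented for the example, and the printed numbers depend on your ABI.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

struct mixed {
    int16_t small;  /* 2-byte alignment on typical ABIs */
    int64_t large;  /* the most aligned field */
};

/* The struct is as aligned as its most aligned field. */
static_assert(alignof(struct mixed) == alignof(int64_t),
              "struct alignment equals the strictest field alignment");

/* Prints the alignment of a few types; call it from main(). */
void print_alignments(void)
{
    printf("char:         %zu\n", alignof(char));
    printf("int32_t:      %zu\n", alignof(int32_t));
    printf("int64_t:      %zu\n", alignof(int64_t));
    printf("double:       %zu\n", alignof(double));
    printf("struct mixed: %zu\n", alignof(struct mixed));
}
```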
Before C11 and C++11 there was no standard way to force alignment to a specific value, only compiler-specific extensions; C11 added _Alignas (alignas with <stdalign.h>) and C++11 added alignas. On Clang and GCC, you can use the __attribute__((aligned(N))) attribute:
struct s_Stuff {
int var1;
short var2;
char padding[10];
} __attribute__((aligned(16)));
(The same attribute can also be applied to a single variable or struct member, where it sets the alignment of just that object.)
Off the top of my head, I'm not sure for Visual Studio, but according to SoronelHaetir, that would be __declspec(align(N)). Microsoft's documentation places it before the struct keyword, as in __declspec(align(16)) struct s_Stuff { ... };
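For comparison, C11 has a standard spelling, alignas from <stdalign.h>, that does not depend on a compiler extension. The sketch below applies it to the first member of s_Stuff, which raises the alignment of the whole struct, and to a plain float array. The helper is_aligned and the array lanes are names made up for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct s_Stuff {
    alignas(16) int var1;  /* aligning one member raises the struct's alignment */
    short var2;
    char padding[10];
};

static_assert(alignof(struct s_Stuff) == 16, "struct is 16-byte aligned");
static_assert(sizeof(struct s_Stuff) % 16 == 0, "size is a multiple of the alignment");

/* A 16-byte aligned array of four floats, suitable for aligned SSE loads. */
alignas(16) float lanes[4];

/* Returns 1 if p is a multiple of alignment (which must be a power of two). */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Calling is_aligned(lanes, 16) at run time is a cheap sanity check before handing a pointer to an aligned-load intrinsic.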
In the context of vector instructions, alignment is important because people tend to create arrays of floating-point values and operate on them, instead of using types that are known to be aligned. However, __m128, __m256 and __m512 (and all of their variants, like __m128i and such) from <immintrin.h>, if your compiler environment has it, are guaranteed to be aligned on the proper boundaries for use with aligned intrinsics.
Depending on your platform, malloc may or may not return memory that is aligned on the correct boundary for vector objects. aligned_alloc was introduced in C11 to address these issues, but not all platforms support it.
Apple: does not support aligned_alloc; malloc returns objects on the most exigent alignment that the platform supports;
Windows: does not support aligned_alloc; malloc returns objects aligned on the largest alignment that VC++ will naturally put an object on without an alignment specification; use _aligned_malloc for vector types
Linux: malloc returns objects aligned on an 8- or 16-byte boundary; use aligned_alloc.
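On a platform that does provide aligned_alloc, usage could look roughly like this. It is a sketch: alloc_floats_16 is a made-up helper, and the size is rounded up to a multiple of 16 because C11 originally required the size to be a multiple of the alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates room for count floats on a 16-byte boundary; release with free(). */
float *alloc_floats_16(size_t count)
{
    /* Reject counts whose byte size (plus rounding) would overflow size_t. */
    if (count > (SIZE_MAX - 15) / sizeof(float))
        return NULL;

    size_t bytes = count * sizeof(float);
    bytes = (bytes + 15) & ~(size_t)15;  /* round up to a multiple of 16 */
    if (bytes == 0)
        bytes = 16;                      /* avoid a zero-sized request */

    float *p = aligned_alloc(16, bytes);
    assert(p == NULL || (uintptr_t)p % 16 == 0);
    return p;
}
```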
In general, it's possible to request slightly more memory and perform alignment yourself with minimal penalties (aside that you're on your own to write a free-like function that will accept a pointer returned by this function):
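#include <stdint.h>
#include <stdlib.h>

/* alignment must be a power of two */
void* aligned_malloc(size_t size, size_t alignment) {
    uintptr_t alignment_mask = alignment - 1;
    void* memory = malloc(size + alignment_mask);
    if (memory == NULL)
        return NULL;
    uintptr_t unaligned_ptr = (uintptr_t)memory;
    uintptr_t aligned_ptr = (unaligned_ptr + alignment_mask) & ~alignment_mask;
    return (void*)aligned_ptr; /* not necessarily what malloc returned: do not pass to free() */
}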
Purists might argue that treating pointers as integers is evil, but at the time of writing, they probably won't have a practical cross-platform solution to offer in exchange.
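One way to get the matching free-like function is to stash the pointer that malloc returned just below the aligned block. This is only a sketch of that idea, not the code above: aligned_malloc_sketch and aligned_free_sketch are hypothetical names, and the stash costs sizeof(void *) extra bytes per allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns size bytes aligned to alignment (a power of two), or NULL. */
void *aligned_malloc_sketch(size_t size, size_t alignment)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;                      /* not a power of two */

    /* Worst-case alignment slack plus room to stash the malloc pointer. */
    size_t overhead = (alignment - 1) + sizeof(void *);
    if (size > SIZE_MAX - overhead)
        return NULL;                      /* size + overhead would overflow */

    unsigned char *raw = malloc(size + overhead);
    if (raw == NULL)
        return NULL;

    /* First address past the stash slot, rounded up to the alignment. */
    uintptr_t first_usable = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned_addr =
        (first_usable + alignment - 1) & ~(uintptr_t)(alignment - 1);
    unsigned char *aligned = raw + (aligned_addr - (uintptr_t)raw);
    assert((uintptr_t)aligned % alignment == 0);

    /* memcpy avoids a misaligned store when alignment < sizeof(void *). */
    void *stash = raw;
    memcpy(aligned - sizeof stash, &stash, sizeof stash);
    return aligned;
}

/* Frees a block obtained from aligned_malloc_sketch; NULL is a no-op. */
void aligned_free_sketch(void *p)
{
    if (p == NULL)
        return;
    void *stash;
    memcpy(&stash, (unsigned char *)p - sizeof stash, sizeof stash);
    free(stash);
}
```

Pointers from aligned_malloc_sketch must only ever go to aligned_free_sketch, never to free directly.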
xx-byte alignment means that the variable's memory address modulo xx is 0.
Ensuring that is a compiler-specific operation. Visual C++, for example, has __declspec(align(...)), which works for variables that the compiler allocates (at file or function scope, for example). Alignment is somewhat harder for dynamic memory. You can use _aligned_malloc for that, although your library may already guarantee 16-byte alignment for malloc; it's generally larger alignments that require such a call.
New Edit to improve and focus my answer to the specific query
To ensure data alignment in memory, there are platform-specific functions that force it (assuming your data is compatible, that is, your data matches or discretely fits into your required alignment).
With Microsoft's C runtime, the function to use is _aligned_malloc instead of vanilla malloc.
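// Using _aligned_malloc (Microsoft CRT, declared in <malloc.h>)
// Note: alignment must be a power of two (2^N).
// required_size is a placeholder for the number of bytes you need.
size_t alignment = 16;
void *ptr = _aligned_malloc(required_size, alignment);
if (ptr == NULL)
{
    printf_s("Error allocating aligned memory.");
    return -1;
}
// ... use ptr, then release it with _aligned_free(ptr), not free(ptr)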
This will (if it succeeds) force your data to align on the 16 byte boundary and should satisfy the requirements for SSE.
Older answer where I waffle on about struct member alignment, which matters - but is not directly answering the query
To ensure struct member byte alignment, you can be careful how you arrange members in your structs (largest first), or you can set this (to some degree) in your compiler settings, member attributes or struct attributes.
Assuming a 32 bit machine with 4 byte ints: this is still 4 byte aligned in memory (its largest member is 4 bytes), but padded to be 16 bytes in size.
struct s_Stuff {
int var1; /* 4 bytes */
short var2; /* 2 bytes */
char padding[10]; /* ensures total struct size is 16 */
};
The compiler usually pads each member to assist with natural alignment, but the padding may be at the end of the struct too. This is struct member data alignment.
Older compiler struct member alignment settings could look similar to these 2 images below...But this is different to data alignment which relates to memory allocation and storage of the data.
It confuses me when Borland uses the phrase (from the images) Data Alignment, and MS uses Struct member alignment. (Although they both refer to specifically struct member alignment)
To maximise efficiency, you need to code for your hardware (or vector processing in this case), so let's assume 32 bit, 4 byte ints, etc. Then you want to use tight structs to save space, but padded structs may improve speed.
struct s_Stuff {
float f1; /* 4 bytes */
float f2; /* 4 bytes */
float f3; /* 4 bytes */
short var2; /* 2 bytes */
};
This struct may be padded so that its members align to 4 byte multiples. The compiler will do this unless you tell it to keep single byte struct member alignment. So the size ON FILE could be 14 bytes, while in MEMORY each element of an array of this struct would occupy 16 bytes (with 2 bytes wasted). The data alignment of the array itself is not known in advance (possibly 8 bytes by default from malloc, but that is not guaranteed). As mentioned above, you can force the data alignment in memory with _aligned_malloc on some platforms.
Also regarding member alignment in a struct, the compiler will use multiples of the largest member to set the alignment. Or more specifically:
A struct is always aligned to the largest type’s alignment
requirements
...from here
If you are using a UNION, you are correct that its size is forced to that of its largest member, see here
Check that your compiler settings do not contradict your desired struct member alignment / padding too, or else your structs may differ in size to what you expect.
Now, why is it faster? See here which explains how alignment allows the hardware to transmit discrete chunks of data and maximises the use of the hardware that passes around data. That is, the data does not need to be split up or re-arranged at every stage - through the hardware processing
As a rule, it's best to set your compiler to match your hardware (and platform OS) so that your alignment (and padding) works best with your hardware processing ability. 32 bit machines usually work best with 4 byte (32 bit) member alignment, but then data written to file with 4 byte member alignment can consume more space than wanted.
Specifically regarding SSE vectors, as this link states, 4 * 4 bytes is the best way to fill a 16 byte vector, perhaps like this. (And they refer to data alignment here.) Note that such a struct is by itself only as aligned as a float, so the memory still has to come from an aligned allocation or carry an alignment specifier.
struct s_data {
float array[4];
};
or simply an array of floats, or doubles.
struct {
uint8_t foo;
uint8_t bar;
uint8_t baz;
uint8_t foos[252];
uint8_t somethingOrOther;
} A;
struct {
uint8_t foo;
uint8_t bar;
uint8_t baz;
uint8_t somethingOrOther;
uint8_t foos[252];
} B;
Does it matter that I've put foos on byte 3 in the first example, vs on byte 4 in B?
Does an array in C have to start aligned?
Is the size of this struct 256 bytes exactly?
Given that the data type is uint8_t (equivalent to unsigned char), there is no need for padding in the structure, regardless of how it is ordered. So, you can reasonably assume in this case that every compiler will make that structure into 256 bytes, regardless of the order of the elements.
If there were data elements of different sizes, then you might well get padding added and the size of the structure might vary depending on the order of the elements.
As the good book says (C11 section 6.7.2.1 paragraph 14):
Each non-bit-field member of a structure or union object is aligned in
an implementation-defined manner appropriate to its type... There
may be unnamed padding within a structure object
You didn't "put foos on byte 3" - apart from the fact that the first element is always on byte 0, you have no real control over what byte an element will be put on. The compiler can give each field a full machine word or more if it thinks that will provide the most efficient accesses. If so, it will allocate "padding" bytes in between - unused space.
Note that this only applies to structs; arrays themselves do not add any padding bytes (if they did, the pointer arithmetic/indexing rule wouldn't work), so you can be sure that the size of foos itself is exactly the declared size, and know the exact alignment of each numbered element.
No, it doesn't matter.
Pretty much every compiler out there will align struct fields to a natural boundary appropriate for the target architecture, by inserting hidden "padding bytes" between fields where necessary.
Specifically, since you're using only uint8_t, there will be no padding bytes inserted - every field already falls on a naturally-aligned boundary. (Everything is a multiple of one.)
Both of the structs you've shown are exactly 256 bytes in size. You can confirm this like so:
#include <stdio.h>

int main(void)
{
    /* A and B are objects of unnamed struct types, so take sizeof the objects */
    printf("sizeof(A)=%zu \n", sizeof(A));
    printf("sizeof(B)=%zu \n", sizeof(B));
    return 0;
}
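The same check can be done at compile time. In this sketch the two layouts get the made-up tags layout_a and layout_b so that sizeof and offsetof can name the types; everything else matches the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct layout_a {
    uint8_t foo;
    uint8_t bar;
    uint8_t baz;
    uint8_t foos[252];
    uint8_t somethingOrOther;
};

struct layout_b {
    uint8_t foo;
    uint8_t bar;
    uint8_t baz;
    uint8_t somethingOrOther;
    uint8_t foos[252];
};

/* Every member has 1-byte alignment, so neither order needs padding. */
static_assert(sizeof(struct layout_a) == 256, "A is exactly 256 bytes");
static_assert(sizeof(struct layout_b) == 256, "B is exactly 256 bytes");
static_assert(offsetof(struct layout_a, foos) == 3, "foos starts at byte 3 in A");
static_assert(offsetof(struct layout_b, foos) == 4, "foos starts at byte 4 in B");
```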
You can prevent this padding from being added by "packing" the structure:
Microsoft Visual C uses #pragma pack(1)
GCC uses __attribute__((packed))
Do note that doing this on a structure with unaligned members can have a serious impact on the performance of your program.
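To make the effect of packing concrete, here is a sketch using #pragma pack(push, 1), which GCC and Clang accept as well as Visual C. The struct names are invented, and the unpacked size assumes uint32_t is 4-byte aligned on your target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: typically 3 padding bytes after tag. */
struct natural_pair {
    uint8_t  tag;
    uint32_t value;
};

#pragma pack(push, 1)
/* Packed layout: value directly follows tag and may be misaligned. */
struct packed_pair {
    uint8_t  tag;
    uint32_t value;
};
#pragma pack(pop)

static_assert(sizeof(struct packed_pair) == 5, "no padding when packed");
static_assert(offsetof(struct packed_pair, value) == 1, "value is unaligned");
static_assert(sizeof(struct natural_pair) > sizeof(struct packed_pair),
              "the natural layout carries padding");
```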
In GCC, __attribute__((packed)) packs to a byte boundary. What purpose does the aligned attribute serve in declarations like the following?
u8 rx_buf[14] __attribute__((aligned(8)));
struct S { short f[3]; } __attribute__ ((aligned (8)));
above array will be of 16 byte, am I right.
That is, will sizeof(rx_buf) be 16 bytes, i.e. with 2 bytes of alignment padding at the end?
The answer to your question is no. The aligned attribute does not change the sizes of variables it is applied to, but the situation is slightly different for structure members. To quote the manual,
aligned (alignment)
This attribute specifies a minimum alignment for the variable or structure field, measured in bytes. For example, the declaration:
int x __attribute__ ((aligned (16))) = 0;
causes the compiler to allocate the global variable x on a 16-byte boundary.
and,
packed
The packed attribute specifies that a variable or structure field should have the smallest possible alignment — one byte for a variable,
and one bit for a field, unless you specify a larger value with the
aligned attribute.
Here is a structure in which the field x is packed, so that it immediately follows a:
struct foo
{
char a;
int x[2] __attribute__ ((packed));
};
Note that the aligned attribute may change the memory layout of structures, by inserting padding between members. Subsequently, the size of the structure will change. For instance:
struct t {
uint8_t a __attribute__((aligned(16))); /* one byte plus 15 as padding here */
uint16_t b __attribute__((aligned(16)));
};
would lead to 15 bytes of padding after a, whereas the default alignment for the target architecture might have resulted in less. If you specified the packed attribute for the structure and dropped the aligned attributes, the structure would have a size of 3 bytes.
Here's an illustration of how the memory layout of a structure might look like in each case.
struct t with no attributes, on a hypothetical target whose default alignment is an 8-byte boundary:
+-+-------+--+-------+
|a| |bb| |
+-+-------+--+-------+
struct t when a and b are aligned on 16-byte boundaries:
+-+---------------+--+---------------+
|a| padding |bb| padding |
+-+---------------+--+---------------+
struct t when a and b have no alignment restrictions and t is packed:
+-+--+
|a|bb|
+-+--+
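If you would rather have the compiler report the numbers than trust a drawing, something like the following works with GCC or Clang. The three struct tags are made up to keep the variants apart, and the 4-byte figure for the default layout assumes uint16_t is 2-byte aligned, which is what I'd expect on common targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* No attributes: layout follows the target's natural alignment. */
struct t_default {
    uint8_t  a;
    uint16_t b;
};

/* Both members forced onto 16-byte boundaries. */
struct t_aligned {
    uint8_t  a __attribute__((aligned(16)));
    uint16_t b __attribute__((aligned(16)));
};

/* Packed, no aligned attributes: members are laid out back to back. */
struct t_packed {
    uint8_t  a;
    uint16_t b;
} __attribute__((packed));

static_assert(offsetof(struct t_aligned, b) == 16, "15 padding bytes after a");
static_assert(sizeof(struct t_aligned) == 32, "14 padding bytes after b");
static_assert(sizeof(struct t_packed) == 3, "no padding at all");
static_assert(sizeof(struct t_default) == 4, "1 padding byte after a");
```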
above array will be of 16 byte, am I right.
Incorrect. The array is still 14 bytes long; all that __attribute__((aligned)) does is provide any necessary padding outside the array to align it to an 8-byte boundary. It is impossible to safely assume anything about where this padding exists, or how much of it there is.
As such, sizeof(rx_buf) will remain 14, just as it would have been without the alignment.
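A short sketch to confirm this with GCC or Clang, using the declarations from the question (u8 is assumed to be a typedef for uint8_t, and starts_on_8 is a made-up helper). Note the contrast with struct S: when aligned is applied to a type, the size of the type is rounded up to a multiple of its alignment, so sizeof(struct S) becomes 8 where short is 2 bytes.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;  /* assumption: the question's u8 is an 8-bit unsigned type */

u8 rx_buf[14] __attribute__((aligned(8)));
struct S { short f[3]; } __attribute__((aligned(8)));

/* The variable keeps its declared size; only its address is constrained. */
static_assert(sizeof(rx_buf) == 14, "aligned does not grow the array");

/* A type's size is always a multiple of its alignment. */
static_assert(sizeof(struct S) == 8, "6 bytes of data plus 2 bytes of padding");

/* Returns 1 if p lies on an 8-byte boundary. */
int starts_on_8(const void *p)
{
    return (uintptr_t)p % 8 == 0;
}
```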
I want to know what the boundary problem is with respect to the sizes allocated for structures.
Any keyword for this that I can google would be helpful.
To calculate the sizes of user-defined types, the compiler takes into account any alignment space needed for complex user-defined data structures. This is why the size of a structure in C can be greater than the sum of the sizes of its members. For example, on many systems, the following code will print 8:
#include <stdio.h>

struct student {
    char grade; /* char is 1 byte long */
    int age;    /* int is 4 bytes long */
};

int main(void) {
    printf("%zu", sizeof (struct student));
}
The reason for this is that most compilers, by default, align complex data-structures to a word alignment boundary. In addition, the individual members are also aligned to their respective alignment boundaries. By this logic, the structure student gets aligned on a word boundary and the variable age within the structure is aligned with the next word address. This is accomplished by way of the compiler inserting "padding" space between two members or to the end of the structure to satisfy alignment requirements. This padding is inserted to align age with a word boundary. (Most processors can fetch an aligned word faster than they can fetch a word value that straddles multiple words in memory, and some don't support the operation at all)
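The amount of padding follows from simple arithmetic: round each offset up to the next multiple of the required alignment. The sketch below spells that out and compares the prediction with what the compiler actually did. The helper names are invented, and it assumes the compiler places each member at the lowest suitably aligned offset, which common compilers do by default.

```c
#include <assert.h>
#include <stddef.h>

struct student {
    char grade;
    int  age;
};

/* Bytes of padding needed so that offset becomes a multiple of align. */
size_t padding_for(size_t offset, size_t align)
{
    return (align - offset % align) % align;
}

/* Offset of age: end of grade, rounded up to the alignment of int. */
size_t predicted_age_offset(void)
{
    size_t end_of_grade = offsetof(struct student, grade) + sizeof(char);
    return end_of_grade + padding_for(end_of_grade, _Alignof(int));
}

/* Total size: end of age, rounded up to the alignment of the struct. */
size_t predicted_size(void)
{
    size_t end_of_age = predicted_age_offset() + sizeof(int);
    return end_of_age + padding_for(end_of_age, _Alignof(struct student));
}
```

With a 4-byte int, padding_for(1, 4) is 3, so age lands at offset 4 and the struct ends at 8 with no trailing padding.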
Referenced article: data structure alignment and Structure padding