Structure alignment issue

My compiler requires memory-aligned structure declarations to ensure proper data access.
I have a top-level structure that is composed of other structures. Is it sufficient to align the top-level structure to a 32-byte boundary, or do I need to align each nested structure to a 32-byte boundary as well?
The code snippet is below:
typedef struct {
int p;
int q;
char n;
} L;
typedef struct {
int c;
int d;
char e;
L X2[13];
} B;
typedef struct {
int a;
int b;
B X1[10];
} M;
To ensure correct data access, do I need to make sure that every structure is properly aligned, or will padding the topmost structure be enough?

Sometimes your application may require specific layout, but if as you say in this case it is a requirement of your compiler (or probably more accurately the target architecture of your compiler), then it is the compiler's responsibility to ensure those requirements are met.
If you require alignment other than what the compiler enforces naturally for the target, you will need compiler-specific directives for packing and alignment; however, applying such directives and getting them wrong is far more likely to cause an alignment fault than letting the compiler handle it. If you attempt to align by adding your own padding members, it may work, but it is unnecessary, and the compiler may still insert additional padding of its own.
The point is the compiler will not generate a structure with members it cannot safely and efficiently address. It will insert any necessary padding between members to ensure that subsequent members are addressable.
If you don't believe it will work, get your linker to output a map file (if it does not do so already) and check the address of these symbols to verify correct alignment. Also look at the generated size of the structures; you may find that some of them are larger than the sum of their parts - that is the compiler forcing alignment by inserting padding.
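If a map file is inconvenient, you can also ask the compiler directly. The sketch below reuses the structures from the question and checks their layout with sizeof, _Alignof (C11) and offsetof. The helper names layout_is_consistent and print_layout are made up for illustration, and the printed numbers depend on the target.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct { int p; int q; char n; } L;
typedef struct { int c; int d; char e; L X2[13]; } B;
typedef struct { int a; int b; B X1[10]; } M;

/* Returns 1 if every nested member starts at an offset that satisfies
 * its own alignment requirement and every size is a multiple of the
 * type's alignment. The compiler is expected to guarantee both. */
int layout_is_consistent(void)
{
    return offsetof(B, X2) % _Alignof(L) == 0
        && offsetof(M, X1) % _Alignof(B) == 0
        && sizeof(L) % _Alignof(L) == 0
        && sizeof(B) % _Alignof(B) == 0;
}

/* Prints size and alignment of each type; values are target-specific. */
void print_layout(void)
{
    printf("L: size %zu, align %zu\n", sizeof(L), (size_t)_Alignof(L));
    printf("B: size %zu, align %zu, X2 at offset %zu\n",
           sizeof(B), (size_t)_Alignof(B), offsetof(B, X2));
    printf("M: size %zu, align %zu, X1 at offset %zu\n",
           sizeof(M), (size_t)_Alignof(M), offsetof(M, X1));
}
```

Calling print_layout() from main shows where padding was added: any size larger than the sum of the members is padding inserted by the compiler.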

If you can use it, C11 has alignment facilities, should your application (or architecture, or performance) need them: http://en.wikipedia.org/wiki/C11_%28C_standard_revision%29
Alignment specification (_Alignas specifier, _Alignof operator, aligned_alloc function, <stdalign.h> header file)
GCC also has its own extensions, such as __attribute__((aligned(n))).
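As a rough sketch of how the C11 facilities fit together for the 32-byte case in the question: the Top32 type and both helper names below are hypothetical, and support for alignments this large is implementation-defined, although GCC and Clang accept it on common targets.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical top-level structure forced onto a 32-byte boundary. */
typedef struct {
    alignas(32) int a;   /* aligning the first member aligns the struct */
    int b;
} Top32;

/* Allocates n Top32 objects on the heap with 32-byte alignment.
 * aligned_alloc expects the size to be a multiple of the alignment,
 * which holds here because sizeof(Top32) is a multiple of 32. */
Top32 *top32_alloc(size_t n)
{
    return aligned_alloc(alignof(Top32), n * sizeof(Top32));
}

/* Returns 1 if p sits on a 32-byte boundary. */
int is_32_aligned(const void *p)
{
    return (uintptr_t)p % 32 == 0;
}
```

Objects of type Top32 with static or automatic storage get the same alignment without any extra work; only heap allocations need aligned_alloc, and the result is released with free as usual.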

If you want to use natural alignment you have to order the
fields and pad manually to the natural word size. Portability
is not guaranteed but for a specific processor this can be done
(not uncommon in the embedded world). If sizeof(int) is 4
you have to add padding to your sub-structs to ensure alignment
in the arrays (I assume that your goal is to avoid "secret" padding
added by the compiler?). For example :
typedef struct {
int p;
int q;
char n;
char pad[3];
} L;
typedef struct {
int c;
int d;
char e;
char pad[3];
L X2[13];
} B;
typedef struct {
int a;
int b;
B X1[10];
} M;
would normally leave no "hidden" padding in the structures.
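Whether any hidden padding remains can be checked by comparing sizeof with the sum of the declared members. This is only a sketch: it assumes sizeof(int) is 4 with 4-byte alignment, as the answer does, and the hidden_padding_* helpers are invented names.

```c
#include <stddef.h>

/* Manually padded structures from the answer above. */
typedef struct { int p; int q; char n; char pad[3]; } L;
typedef struct { int c; int d; char e; char pad[3]; L X2[13]; } B;
typedef struct { int a; int b; B X1[10]; } M;

/* Bytes the compiler added beyond the declared members (0 = none). */
size_t hidden_padding_L(void)
{
    return sizeof(L) - (2 * sizeof(int) + 1 + 3);
}

size_t hidden_padding_B(void)
{
    return sizeof(B) - (2 * sizeof(int) + 1 + 3 + sizeof(L[13]));
}

size_t hidden_padding_M(void)
{
    return sizeof(M) - (2 * sizeof(int) + sizeof(B[10]));
}
```

In production code the same comparison could be written as a static_assert, so that a port to a target with a different int size fails at compile time rather than silently changing the layout.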

Related

Memory Alignment warning with gcc

I'm trying to implement a polymorphic data structure e.g. an intrusive linked list (I already know the kernel has one - this is more of learning experience).
The trouble is casting a nested struct to the containing struct leads to gcc issuing a memory-alignment warning.
The specifics are provided below:
// t.c
#include <stdio.h>
enum OL_TYPE { A, B, C };
struct base {
enum OL_TYPE type;
};
struct overlay {
struct base base;
int i;
};
struct overlay2 {
struct base base;
float f;
int i;
// double d; // --> adding this causes a memory alignment warning
};
void testf(struct base *base) {
if (base->type == A) {
struct overlay *olptr = (struct overlay *)base;
printf("overlay->i = %d\n", olptr->i);
} else
if (base->type == B) {
struct overlay2 *olptr = (struct overlay2 *)base;
printf("overlay->i = %d\n", olptr->i);
}
}
int main(int argc, char *argv[]) {
struct overlay ol;
ol.base.type = A;
ol.i = 3;
testf(&ol.base);
}
Compiled with gcc t.c -std=c99 -pedantic -fstrict-aliasing -Wcast-align=strict -O3 -o q leads to this:
t.c: In function ‘testf’:
t.c:28:34: warning: cast increases required alignment of target type [-Wcast-align]
28 | struct overlay2 *olptr = (struct overlay2 *)base;
| ^
It's interesting to notice that if I comment out the double from overlay2 such that it's not part of the structure anymore, the warning disappears.
In light of this, I have a few questions:
Is there any danger to this?
If yes, what is the danger? If not, does that mean the warning is a false positive and how can it be dealt with?
Why does adding the double suddenly lead to the alignment warning?
Is misalignment / an alignment problem even possible if I cast from base to a struct overlay2 * IF the actual type IS in fact a struct overlay2 with a nested struct base ? Please explain!
Appreciate any answers that can provide some insight

is there any danger to this?
Only if what base points to was not allocated as a struct overlay2 in the first place. If you're just smuggling in a pointer to struct overlay2 as a struct base*, and casting back internally, that should be fine (the address would actually be aligned correctly in the first place).
If yes, what is the danger? If not, does that mean the warning is a false positive and how can it be dealt with?
The danger occurs when you allocate something as something other than struct overlay2 then try to use the unaligned double inside it. For example, a union of struct base and char data[sizeof(struct overlay2)] would be the right size to contain a struct overlay2, but it might be four byte aligned but not eight byte aligned, causing the double (size 8, alignment 8) to be misaligned. On x86 systems, a misaligned field typically just slows your code, but on non-x86 misaligned field access can crash your program.
As for dealing with it, you can silence the warning (with some compiler-specific directives), or you can just force all your structures to be aligned identically, by adding alignas directives (available since C11/C++11):
#include <stdalign.h> // Add at top of file to get alignas macro
struct base {
alignas(double) enum OL_TYPE type;
};
why does adding the double suddenly lead to the alignment warning?
Because it's the only data type in any of the structures that requires eight byte alignment; without it, the structures can all be legally aligned to four bytes; with it, overlay2 needs eight byte alignment while the others only need four byte alignment.
Is an alignment problem even possible if I cast from base to a struct overlay2 * IF the actual type IS in fact a struct overlay2 with a nested struct base?
As noted above, if it was really pointing to something allocated as a struct overlay2 all along, you're safe.
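If you would rather check than trust, the downcast can be wrapped in a helper that verifies both the tag and the address before converting. This is a sketch, not part of the original code: overlay2_from_base is a hypothetical name, and testing alignment through uintptr_t is a widespread convention rather than something the standard spells out.

```c
#include <stddef.h>
#include <stdint.h>

enum OL_TYPE { A, B, C };

struct base { enum OL_TYPE type; };

struct overlay2 {
    struct base base;
    float f;
    int i;
    double d;   /* raises the alignment requirement of overlay2 */
};

/* Hypothetical helper: converts a base pointer back to its container.
 * Returns NULL if the tag does not match or the address is not
 * suitably aligned for struct overlay2. */
struct overlay2 *overlay2_from_base(struct base *b)
{
    if (b == NULL || b->type != B)
        return NULL;
    if ((uintptr_t)b % _Alignof(struct overlay2) != 0)
        return NULL;
    /* Converting through void * typically avoids the -Wcast-align
     * diagnostic, since the compiler no longer sees a source type. */
    return (struct overlay2 *)(void *)b;
}
```

The alignment test can never fail for an object that really was created as a struct overlay2; it only catches the case described above, where the storage was allocated as something else.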

It's not really a false positive, because the compiler doesn't know that your passed base* argument will be the address of an actual overlay2 structure (even though you do). However, if that condition is always going to be true, then no issue should arise due to inappropriate alignment.
However, to be completely sure (and to silence the warning), you could always make the base structure align to the requirements of a double, if that doesn't cause any other issues:
struct base {
_Alignas(double) enum OL_TYPE type;
};
Why does adding the double suddenly lead to the alignment warning
Because, in general, the alignment requirement of a structure must be at least as strict as the strictest alignment requirement among its members. Without that guarantee, an array of those structures would inevitably misalign such members in at least one of any two adjacent elements.
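The array argument can be observed directly. The sketch below uses a made-up struct with_double and a made-up helper; it walks an array and confirms that the double member is aligned in every element, which only works because the struct inherits the alignment of its strictest member and its size is a multiple of that alignment.

```c
#include <stddef.h>
#include <stdint.h>

struct with_double { int i; double d; };

/* Returns 1 if member d is correctly aligned in every element of arr. */
int all_doubles_aligned(const struct with_double *arr, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if ((uintptr_t)&arr[k].d % _Alignof(double) != 0)
            return 0;
    }
    return 1;
}
```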

Some processors require that pointers to an item of size S must be aligned to the next power-of-two greater than or equal to S, otherwise you get memory alignment faults.
If you take a pointer to a variable (including an aggregate) with a weaker alignment requirement (a lower power of two) and cast it to a pointer to a type with a stricter alignment requirement, you potentially set yourself up for those alignment faults.
If this is likely to be an issue, one option is to put all the item types that will be cast between into a union, always allocate that union, and then reference the type within the union.
That has the potential to get expensive on memory, so another possibility is to create a copy function that initialises a greater-alignment object from the contents of the lesser-alignment one, ensuring any additional values the larger one has will be set to sensible default values.
Finally, and since we're talking about linked lists anyway, the better option is to have a single struct type that comprises the linked items. It will also contain a void * or union thing * pointer to the actual data and an indication of what type of data the pointer refers to. This way you're pretty economical with memory, you don't have to worry about alignment issues, and you have the cleanest and most adaptable representation.
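A minimal sketch of that last option follows. All names here (payload_kind, node, list_push, list_sum) are invented for illustration, and only two payload types are handled to keep it short.

```c
#include <stddef.h>

enum payload_kind { PAYLOAD_INT, PAYLOAD_DOUBLE };

/* One node type for the whole list; payloads live elsewhere. */
struct node {
    struct node *next;
    enum payload_kind kind;  /* says what data points to */
    void *data;              /* points to an object of its own type */
};

/* Pushes n onto the front of the list and returns the new head. */
struct node *list_push(struct node *head, struct node *n,
                       enum payload_kind kind, void *data)
{
    n->next = head;
    n->kind = kind;
    n->data = data;
    return n;
}

/* Sums all payloads as doubles, dispatching on the tag. */
double list_sum(const struct node *head)
{
    double total = 0.0;
    for (; head != NULL; head = head->next) {
        if (head->kind == PAYLOAD_INT)
            total += *(const int *)head->data;
        else
            total += *(const double *)head->data;
    }
    return total;
}
```

Because each payload is an ordinary object of its own type, it is always correctly aligned, and no pointer to a weakly aligned struct is ever cast to a more strictly aligned one.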

How can size of a structure be a non-multiple of 4?

I'm new to structures and was learning how to find the size of structures. I'm aware of how padding comes in to play in order to properly align the memory. From what I've understood, the alignment is done so that the size in memory comes out to be a multiple of 4.
I tried the following piece of code on GCC.
struct books{
short int number;
char name[3];
}book;
printf("%zu", sizeof(book));
Initially I had thought that the short int would occupy 2 bytes, followed by the character array starting at the third memory location from the beginning. The character array then would need a padding of 3 bytes which would give a size of 8. Something like this, where each word represents a byte in memory.
short short char char
char padding padding padding
However on running it gives a size of 6, which confuses me.
Any help would be appreciated, thanks!

Generally, padding is inserted to allow for aligned access of the internal elements of the structure, not to allow the entire structure to be a size of multiple words. Alignment is a compiler implementation issue, not a requirement of the C standard.
So, the char elements which are 3 bytes in length, need no alignment because they are byte elements.
It is preferred, though not always required, that the short element be aligned on a short boundary, which means an even address. With that alignment, the compiler can issue a single load-short instruction rather than having to load a word, mask, and then shift.
In this case, the padding is probably, but not necessarily, happening at the end rather than in the middle. You will have to write code to dump the address of the elements to determine where padding is taking place.
EDIT: As @Euguen Sh mentions, even if you discover the padding scheme that the compiler uses for the structure, a different version of the compiler could change it.
It is unwise to count on the padding scheme of the compiler. There are always methods to access the elements in such a way that you do not guess at alignments.
The sizeof() operator is used to allow you to see how much memory is used AND to know how much will be added to a ptr to the structure if that pointer is incremented by 1 (ptr++).
EDIT 2, Packing: Structures may be packed to prevent padding using the __packed__ attribute. When designing a structure, it is wise to use elements that naturally pack. This is especially important when sending data over a communications link. A carefully designed structure avoids the need for padding in the middle of the structure. A poorly designed structure that is then compiled with the __packed__ attribute may have internal elements that are not naturally aligned. One might do this to ensure that the structure is transmitted across a wire exactly as it was originally designed. This type of effort has diminished with the introduction of JSON for transmitting data over a wire.
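Dumping the member offsets, as suggested above, takes only a few lines with offsetof. The helper names are made up for this sketch. With a 2-byte short it is expected to report one byte of padding after name, which matches the observed size of 6.

```c
#include <stddef.h>
#include <stdio.h>

struct books {
    short int number;
    char name[3];
};

/* Trailing padding: bytes between the end of the last member and
 * the end of the structure. */
size_t books_tail_padding(void)
{
    return sizeof(struct books)
         - (offsetof(struct books, name)
            + sizeof(((struct books *)0)->name));
}

/* Prints where each member starts and how large the struct is. */
void books_dump_layout(void)
{
    printf("number at %zu, name at %zu, size %zu, tail padding %zu\n",
           offsetof(struct books, number), offsetof(struct books, name),
           sizeof(struct books), books_tail_padding());
}
```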

#include <stdalign.h>
#include <assert.h>
The size of a struct is always divisible by the maximum alignment of the members (which must be a power of two).
If you have a struct with a char and a short, its alignment is typically 2, because the alignment of short is 2; a struct made only of chars has an alignment of 1.
There are multiple ways to manipulate the alignment:
alignas(4) char buf[4]; // suitably aligned to hold a 32-bit int
This is nonstandard, but available in most compilers (GCC, Clang, ...):
struct A {
char a;
short b;
};
struct __attribute__((packed)) B {
char a;
short b;
};
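static_assert(sizeof(struct A) == 4, "one padding byte after a");
static_assert(alignof(struct A) == 2, "alignment of short");
static_assert(sizeof(struct B) == 3, "no padding when packed");
static_assert(alignof(struct B) == 1, "packed struct is byte-aligned");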

Compilers usually follow the ABI of the target architecture.
The ABI defines the alignment of structures and primitive data types, which in turn determines the padding needed and the sizes of structures. Because the alignment is a multiple of 4 on many architectures, structure sizes often are too.
Compilers may offer attributes or options for changing alignment more or less directly.
For example, gcc and clang offer: __attribute__ ((packed))

Memory - Natural address boundary

Definition
Structure padding is the process of aligning data members of the structure in accordance with the memory alignment rules specified by the processor.
What is the memory alignment rule for an Intel x86 processor?
As I understand it, natural address boundaries on an Intel x86 processor are 32 bits each (i.e., addressOffset%4==0).
So, on an x86 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[3]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[3]; /* for alignment of the whole struct using array */
};
What is the memory alignment rule for an Intel x86-64 processor?
As I understand it, natural address boundaries on an Intel x86-64 processor are 64 bits each (i.e., addressOffset%8==0).
So, on an x86-64 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[7]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[7]; /* for alignment of the whole struct using array */
};
If the above understanding is correct, then I would like to know why an array of int is used for bit operations.
The recommendation to use int-sized data, as mentioned here, is justified by the claim that accessing int-sized data is the most cost-efficient way to access memory.
Question:
Is it this memory alignment rule that forces one to declare int-sized data for bit operations?

Addendum: this is valid for x86/-64 bit processors, but also for others. I am blindly assuming you're using those. For others, you should check the respective manuals.
If fasm automatically added fillers into my structs I'd go insane. In general, performance is better when accesses to memory fall on a boundary corresponding to the size of the element you want to retrieve. That being said, it's not an absolute necessity!
This article here might be worth a look: https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures
Intel's suggestion for optimal layout is to start with the biggest elements and go smaller as the structure continues. That way you stay properly aligned as long as the first element is aligned properly. Since there are no three-byte elements, misalignment is out of the question, and all the compiler might do is add bytes at the end, which is the best way to make sure it won't ruin things if you choose to do direct memory accesses instead of using variables.
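To see what the ordering buys, compare the struct from the question with a largest-first version. This is only a sketch with invented names (mystruct_sorted and both helpers). Note that padding follows the alignment of the member type rather than the machine word size: under the common x86 and x86-64 ABIs int is 4 bytes with 4-byte alignment, so the gap in front of b is 3 bytes on both, and the two structs typically come out at 12 and 8 bytes.

```c
#include <stddef.h>

/* Declaration order from the question. */
struct mystruct_A { char a; int b; char c; };

/* Same members, largest first. */
struct mystruct_sorted { int b; char a; char c; };

/* Padding the compiler inserted in front of b. */
size_t gap_before_b(void)
{
    return offsetof(struct mystruct_A, b) - sizeof(char);
}

/* Bytes saved per object by sorting members largest first. */
size_t bytes_saved(void)
{
    return sizeof(struct mystruct_A) - sizeof(struct mystruct_sorted);
}
```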
The safest procedure is not to rely on your compiler, but to align the data properly yourself.
Fun Fact: loops work the same way. Padding NOPs in your code, before the start of a loop, can make a difference.

What is there to be gained by deterministic field ordering in the memory layout?

Members of a structure are allocated within the structure in the order of their appearance in the declaration and have ascending addresses.
I am faced with the following dilemma: when I need to declare a structure, do I
(1) group the fields logically, or
(2) in decreasing size order, to save RAM and ROM size?
Here is an example, where the largest data member should be at the top, but also should be grouped with the logically-connected colour:
struct pixel{
int posX;
int posY;
tLargeType ColourSpaceSecretFormula;
char colourRGB[3];
};
The padding of a structure is non-deterministic (that is, is implementation-dependent), so we cannot reliably do pointer arithmetic on structure elements (and we shouldn't: imagine someone reordering the fields to his liking: BOOM, the whole code stops working).
-fpack-struct solves this in gcc, but has other limitations, so let's leave compiler options out of the question.
On the other hand, code should be, above all, readable. Micro optimizations are to be avoided at all cost.
So, I wonder, why are structures' members ordered by the standard, making me worry about the micro-optimization of ordering struct member in a specific way?

The compiler is limited by several traditional and practical limitations.
The pointer to the struct after a cast (the standard calls it "suitably converted") will be equal to the pointer to the first element of the struct. This has often been used to implement overloading of messages in message passing. In that case a struct has the first element that describes what type and size the rest of the struct is.
The last element can be a dynamically resized array. Even before official language support this has been often used in practice. You allocate sizeof(struct) + length of extra data and can access the last element as a normal array with as many elements that you allocated.
Those two things force the compiler to have the first and last elements in the struct in the same order as they are declared.
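Both idioms fit in one short sketch: a header whose first member identifies the message, followed by a C99 flexible array member. The struct msg type and the msg_create helper are hypothetical names, not taken from any particular library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical message: a header followed by a variable-length payload. */
struct msg {
    int type;              /* first member: shares the struct's address */
    size_t len;
    unsigned char data[];  /* flexible array member, must be last (C99) */
};

/* Allocates a message with room for len payload bytes and copies them in.
 * Returns NULL on allocation failure; the caller frees the result. */
struct msg *msg_create(int type, const void *payload, size_t len)
{
    struct msg *m = malloc(sizeof *m + len);
    if (m == NULL)
        return NULL;
    m->type = type;
    m->len = len;
    if (len > 0)
        memcpy(m->data, payload, len);
    return m;
}
```

A receiver that only knows it was handed some message can convert the pointer to int * and read the type first, which works precisely because the first member is guaranteed to sit at offset zero.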
Another practical requirement is that every compilation must order the struct members the same way. A smart compiler could make a decision that since it sees that some struct members are always accessed close to each other they could be reordered in a way that makes them end up in a cache line. This optimization is of course impossible in C because structs often define an API between different compilation units and we can't just reorder things differently on different compilations.
The best we could do given the limitations is to define some kind of packing order in the ABI to minimize alignment waste that doesn't touch the first or last element in the struct, but it would be complex, error prone and probably wouldn't buy much.

If you couldn't rely on the ordering, then it would be much harder to write low-level code which maps structures onto things like hardware registers, network packets, external file formats, pixel buffers, etc.
Also, some code use a trick where it assumes that the last member of the structure is the highest-addressed in memory to signify the start of a much larger data block (of unknown size at compile time).

Reordering the fields of a structure can sometimes yield good gains in data size and often also in code size, especially in a 64-bit memory model. Here is an example to illustrate (assuming common alignment rules):
struct list {
int len;
char *string;
bool isUtf;
};
will take 12 bytes in 32 bit but 24 in 64 bit mode.
struct list {
char *string;
int len;
bool isUtf;
};
will take 12 bytes in 32 bit but only 16 in 64 bit mode.
If you have an array of these structures, the first layout takes 50% more data than the second, and code size can benefit too, as indexing by a power of 2 is simpler than by other sizes.
If your structure is a singleton or not frequent, there's not much point in reordering the fields. If it is used a lot, it's a point to look at.
As for the other point of your question, why the compiler doesn't reorder fields itself: if it did, it would be difficult to implement unions of structures that share a common initial pattern. For example:
#include <stdbool.h>
enum type { TYPE_A, TYPE_B, TYPE_C }; /* illustrative tag values */
struct header {
enum type type;
int len;
};
struct a {
enum type type;
int len;
bool whatever1;
};
struct b {
enum type type;
int len;
long whatever2;
long whatever4;
};
struct c {
enum type type;
int len;
float fl;
};
union u {
struct header header;
struct a a;
struct b b;
struct c c;
};
If the compiler rearranged the fields, this construct would be much more inconvenient, as there would be no guarantee that the type and len fields were identical when accessing them via the different structs included in the union.
If I remember correctly the standard even mandates this behaviour.
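The rule in question is the "common initial sequence" guarantee for structures that are members of the same union. A trimmed-down sketch of how it is typically used follows; the enumerator names and the u_type and u_len helpers are assumptions made for the example.

```c
#include <stdbool.h>

enum type { TYPE_A, TYPE_B };   /* hypothetical tag values */

struct header { enum type type; int len; };
struct a { enum type type; int len; bool whatever1; };
struct b { enum type type; int len; long whatever2; long whatever4; };

union u {
    struct header header;
    struct a a;
    struct b b;
};

/* Reads the tag through the header view regardless of which member was
 * stored. This relies on all members sharing the same initial sequence
 * and on the union type being visible at this point. */
enum type u_type(const union u *p)
{
    return p->header.type;
}

/* Reads the length the same way. */
int u_len(const union u *p)
{
    return p->header.len;
}
```

If the compiler were free to reorder members per struct, type and len could land at different offsets in struct a and struct b, and neither helper would be reliable.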

sizeof sideeffect and allocation location [duplicate]

I cannot understand why it is like this:
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
char b;
int a;
} A;
typedef struct
{
char b;
} B;
int main() {
A object;
printf("sizeof char is: %zu\n",sizeof(char));
printf("sizeof int is: %zu\n",sizeof(int));
printf("==> the sizeof both are: %zu\n",sizeof(int)+sizeof(char));
printf("and yet the sizeof struct A is: %zu\n",sizeof(object));
printf("why?\n");
B secondObject;
printf("pay attention that the sizeof struct B is: %zu which is equal to the "
"sizeof char\n",sizeof(secondObject));
return 0;
}
I think the code explains my question, so there is no need to say more. Besides, I have another question:
I know that allocation can happen on the heap, the static heap, or the stack, but what does it mean that an allocation location is unknown? How can that be?
I am talking about this example:
#include <stdlib.h>
#include <string.h>
typedef struct
{
char *_name;
int _id;
} Entry;
int main()
{
Entry ** vec = (Entry**) malloc(sizeof(Entry*)*2);
vec[0] = (Entry *) malloc(sizeof (Entry));
vec[0]->_name = (char*)malloc(6);
strcpy (vec[0]->_name, "name");
vec[0]->_id = 0;
return 0;
}
I know that:
vec is on the stack.
*vec is on the heap.
*vec[0] is on the heap.
vec[0]->_id is on the heap.
but :
vec[0]->_name is unknown
why ?

There may be an unspecified amount of padding between the members of a structure and at its end. In C, the size of a structure object is therefore greater than or equal to the sum of the sizes of its members.

Take a look at this question as well as this one and many others if you search for CPU and memory alignment. In short, CPUs are happier if they access memory aligned to the size of the data they are reading. For example, if you are reading a uint16_t, then it is more efficient (on most CPUs) to read at an address that is a multiple of 2. The details of why CPUs are designed this way are a whole other story.
This is why compilers come to the rescue and pad the fields of structures in the way that is most comfortable for the CPU to access, at the cost of extra storage space. In your case, you are probably getting 3 bytes of padding between your char and int, assuming int is 4 bytes.
If you look at the C standard (which I don't have nearby right now), or the man page of malloc, you will see such a phrase:
The malloc() and calloc() functions return a pointer to the allocated memory
that is suitably aligned for any kind of variable.
This behavior is exactly due to the same reason I mentioned above. So in short, memory alignment is something to care about, and that's what compilers do for you in struct layout and other places, such as layout of local variables etc.
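The malloc guarantee can be spot-checked with C11's max_align_t, the type whose alignment is the strictest fundamental alignment on the implementation. The two helpers below are illustrative names only, and testing an address through uintptr_t is a convention, not something the standard defines precisely.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is aligned for any object with fundamental alignment. */
int is_max_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(max_align_t) == 0;
}

/* Allocates one max_align_t-sized block and reports its alignment.
 * Returns -1 if the allocation fails. */
int malloc_result_is_aligned(void)
{
    void *p = malloc(sizeof(max_align_t));
    if (p == NULL)
        return -1;
    int ok = is_max_aligned(p);
    free(p);
    return ok;
}
```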

You're running into structure padding here. The compiler is likely inserting three bytes' worth of padding after the b field in struct A, so that the a field is 4-byte aligned. You can control this padding to some degree using compiler-specific features, for example the pack pragma on MSVC or the aligned attribute on GCC, but I would not recommend this. Structure padding is there to satisfy member alignment restrictions, and some architectures will fault on unaligned accesses. (Others might fix up the alignment manually, but typically do this rather slowly.)
See also: http://en.wikipedia.org/wiki/Data_structure_alignment#Data_structure_padding
As to your second question, I'm unsure what you mean by the name is "unknown". Care to elaborate?

The compiler is free to add padding in structures to ensure that datatypes are aligned properly. For example, an int will be aligned to sizeof(int) bytes. So I expect the output for the size of your A struct is 8. The compiler does this, because fetching an int from an unaligned address is at best inefficient, and at worst doesn't work at all - that depends on the processor that the computer uses. x86 will fetch happily from unaligned addresses for most data types, but will take about twice as long for the fetch operation.
In your second code-snippet, you haven't declared i.
So vec[0]->_name is not unknown - it is on the heap, just like anything else you get from "malloc" (and malloc's siblings).
