I have been reading data structure alignment articles but I'm getting nowhere. Perhaps things are just too complicated for me to understand. I also came across data structure padding which is also necessary to align data. How do I add a data structure padding to struct usb_ep? Also how do I make sure that whenever I perform kmalloc the data to be read should be at a memory offset which is some multiple of 4?
Regarding alignment, kmalloc will align the structures properly. If you have an 4byte variable, it will be 4bytes aligned, if you have an 8byte vaiable, it will be 8bytes aligned. Understanding alignment is the reason why padding is needed.
What you dont want to get is garbade padding between the variables in your struct. You can do that with the pragma pack directive (probably easiest) or by adding the padding manually.
Example
struct usb_ep
{
short a; /* 2 bytes*/
int b; /* 4 bytes*/
short c; /* 2 bytes*/
};
The size of all the elements is 8bytes, but due to alignment requirements, the size will be 12bytes. Memory layout would be like this:
short a - 2 bytes
char pad[2] - 2 bytes of padding
int b - 4 bytes
short c - 2 bytes
char pad[2] - 2 bytes of padding
In order to not get any padding, or increasing the size of the struct, you can rearrange elements in order to satisfy the alignment requirements.
That is having a struct:
struct usb_ep
{
short a; /* 2 bytes*/
short c; /* 2 bytes*/
int b; /* 4 bytes*/
};
Will have the size of 8bytes, and no requirement for adding padding.
This comes from http://minirighi.sourceforge.net/html/kmalloc_8c.html
void * kmemalign (size_t alignment, size_t size)
Allocate some memory aligned to a boundary.
Parameters:
alignment The boundary.
size The size you want to allocate.
Exceptions:
NULL Out-of-memory.
Returns:
A pointer to a memory area aligned to the boundary. The pointer is a aligned_mem_block_t pointer, so if you want to access to the data area of this pointer you must specify the p->start filed.
Note:
Use kfree(void *ptr) to free the allocated block.
The best way to pad fields in a structure is to declare your variables in descending size. So your largest ones first, then down to the smallest.
struct example {
double amount;
char *name;
int cnt;
char is_valid;
};
This doesn't always end up with logically connected items in the structure, but will typically give the most compact and easily accessible memory usage.
You can use use padding bytes in your struct declarations, but they clutter up the code, and do not guarantee compact structures. A compiler may align every byte on a 4 byte boundary, so you might end up with
struct example2 {
char a;
char padding1[3];
char b;
char padding2[3];
};
taking 4 bytes for a, 4 bytes for padding1, 4 bytes for b, and 4 bytes for padding2. Some compilers allow you to specify packed structures which would yield the correct result in this case. Usually I just declare the fields from largest to smallest types and leave it at that. If you need to share memory between two langages/compilers, then you need to make sure the structs align identically in memory.
Related
struct {char c; int i;} A[100];
struct {char c[100]; int i[100];} B;
I need to minimize the amount of memory used. Which of these uses less and why?
The second one typically uses less.
The first one, due to alignment constraints, is likely to make an eight byte struct, using 800 bytes total for 100 copies of it; one byte for the char, three bytes for padding, four for the int.
The second one doesn't need padding for alignment (the char array ends at a four byte alignment), and just allocates 100 bytes, then four hundred bytes for a total of 500 bytes of memory used.
This assumes int is four bytes, which is by far the most common case; adjust as needed for weirdo systems.
I have a structure of the following type:
typedef struct
{
unsigned char A;
unsigned long int B;
unsigned short int C;
}
According to the alignment requirements of each basic data type and the alignment requirement of the whole structure, the allocation in memory will be like that:
My question is what is the importance of those trailing padding bytes as long as each structure member is naturally aligned to its size and could be accessed by our processor in one cycle (assuming that the bus size of processor is 32-bit) without alignment faults.
Also, if we declared an array of "2" of this structure, without taking into consideration the trailing bytes, the allocation in memory will be as following:
Each member in the two structures is naturally aligned to its size and could be accessed in one cycle without alignment faults.
So, what is the importance of trailing bytes in this case ?!
The comments from Bryan Olivier and Hans Passant are both right.
Essentially, you have answered your own question: In the 2nd drawing, the alignment of the members of both the first and second array item are correct. If the compiler could layout structures like this there would be no importance to the trailing pad bytes. But it can't.
In C, a structure's layout and size must be the same for every instance of a structure. In your second example, the sizeof(array[0]) is 10 and sizeof(array[1]) is 8. The address of B2 is only two greater than A2, but &B1 is four greater than &A1.
It's more than just alignment - it's ensuring a constant layout and size while still ensuring alignment. Even then it would not require trailing pad bytes if the first byte was aligned, but as you have noticed if you add arrays, then you need them.
Alignment+Layout/Size+Arrays => trailing pad is required.
As Andres said, the compiler can't generate special layout to every member in the array of structures.
For example, assume that the programmer has defined a structure with the following type
typedef struct
{
unsigned char A;
unsigned long int B;
unsigned short int C;
} myStructureType;
And then the programmer has created an instance of this type:
myStructureType myStructure;
The compiler will allocate memory to myStructure as following:
If the programmer has decided to create an array of myStructureType, the compiler will repeat the previous pattern in memory by the number of array's elements as following:
If the compiler neglects the trailing padding bytes, the memory will become misaligned as following
32-bit modern processor will need two memory cycles and some masking operations to fetch B1 (the element B in the index "1" of the array of structures). However, an old processor will fire an alignment faults.
That's why the trailing padding was important in this case
#include<stdio.h>
struct std
{
char c;
char e;
};
void main()
{
struct std p,q;
int start,last;
last= &q;
start= &p;
printf("Size:%u %u Address_Diff:%d \n",&p, &q,(last-start) );
}
I am learning structure alignment and padding while we executing above program every time difference between two structure variables showing 16 bytes. I am not understanding why the difference is 16 bytes but i'm expecting the difference is 2 bytes.
some examples:
4077309424 4077309440 diff:16
3600238672 3600238688 diff:16
3272011008 3272011024 diff:16
Generally compiler may pad bytes, to keep each structure variable start position to word or double-word alignment. I guess the choice of alignment position may be architecture and compiler dependent.
It seems your compiler is keeping 16-byte alignment for each structure variable. Sometimes compiler may pad bytes to elements of structure to keep word or double-word alignment. You can check where the bytes padding happening, I mean bytes or padded at element level or structure level.
Do print each element address (as below) and see the difference between them.
printf("Size:%u %u Address_Diff:%d \n",&p.c, &p.e);
If difference is just one byte, it means that bytes are padded at structure level i.e., at the end of structure.
C question:
Does malloc'ing a struct always result in linear placement from top to bottom of the data inside? As a second minor question: is there a standard on the padding size, or does it vary between 32 and 64 bit machines?
Using this test code:
#include <stdio.h>
#include <stdlib.h>
struct test
{
char a;
/* char pad[3]; // I'm assuming this happens from the compiler */
int b;
};
int main() {
int size;
char* testarray;
struct test* testarraystruct;
size = sizeof(struct test);
testarray = malloc(size * 4);
testarraystruct = (struct test *)testarray;
testarraystruct[1].a = 123;
printf("test.a = %d\n", testarray[size]); // Will this always be test.a?
free(testarray);
return 0;
}
On my machine, size is always 8. Therefore I check testarray[8] to see if it's the second struct's 'char a' field. In this example, my machine returns 123, but this obviously isn't proof it always is.
Does the C compiler have any guarantees that struct's are created in linear order?
I am not claiming the way this is done is safe or the proper way, this is a curiosity question.
Does this change if this is becomes C++?
Better yet, would my memory look like this if it malloc'd to 0x00001000?
0x00001000 char a // First test struct
0x00001001 char pad[0]
0x00001002 char pad[1]
0x00001003 char pad[2]
0x00001004 int b // Next four belong to byte b
0x00001005
0x00001006
0x00001007
0x00001008 char a // Second test struct
0x00001009 char pad[0]
0x0000100a char pad[1]
0x0000100b char pad[2]
0x0000100c int b // Next four belong to byte b
0x0000100d
0x0000100e
0x0000100f
NOTE: This question assumes int's are 32 bits
As far as I know, malloc for struct is not linear placement of data but it's a linear allocation of memory for the members with in the structure that too when you create an object of it.
This is also necessary for padding.
Padding also depends on the type of machine (i.e 32 bit or 64 bit).
The CPU fetches the memory based on whether it is 32 bit or 64 bit.
For 32 bit machine your structure will be:
struct test
{
char a; /* 3 bytes padding done to a */
int b;
};
Here your CPU fetch cycle is 32 bit i.e 4 bytes
So in this case (for this example) the CPU takes two fetch cycles.
To make it more clear in one fetch cycle CPU allocates 4 bytes of memory. So 3 bytes of padding will be done to "char a".
For 64 bit machine your structure will be:
struct test
{
char a;
int b; /* 3 bytes padding done to b */
}
Here the CPU fetch cycle is 8 bytes.
So in this case (for this example) the CPU takes one fetch cycles. So 3 bytes of padding here must be done to "int b".
However you can avoid the padding you can use #pragma pack 1
But this will not be efficient w.r.t time because here CPU fetch cycles will be more (for this example CPU fetch cycles will be 5).
This is tradeoff between CPU fetch cycles and padding.
For many CPU types, it is most efficient to read an N-byte quantity (where N is a power of 2 — 1, 2, 4, 8, sometimes 16) when it is aligned on an N-byte address boundary. Some CPU types will generate a SIGBUS error if you try to read an incorrectly aligned quantity; others will make extra memory accesses as necessary to retrieve an incorrectly aligned quantity. AFAICR, the DEC Alpha (subsequently Compaq and HP) had a mechanism that effectively used a system call to fix up a misaligned memory access, which was fiendishly expensive. You could control whether that was allowed with a program (uac — unaligned access control) which would stop the kernel from aborting the process and would do the necessary double reads.
C compilers are aware of the benefits and costs of accessing misaligned data, and go to lengths to avoid doing so. In particular, they ensure that data within a structure, or an array of structures, is appropriately aligned for fast access unless you hold them to ransom with quasi-standard #pragma directives like #pragma pack(1).
For your sample data structure, for a machine where sizeof(int) == 4 (most 32-bit and 64-bit systems), then there will indeed be 3 bytes of padding after an initial 1 byte char field and before a 4-byte int. If you use short s; after the single character, there would be just 1 byte of padding. Indeed, the following 3 structures are all the same size on many machines:
struct test_1
{
char a;
/* char pad[3]; // I'm assuming this happens from the compiler */
int b;
};
struct test_2
{
char a;
short s;
int b;
};
struct test_3
{
char a;
char c;
short s;
int b;
};
The C standard mandates that the elements of a structure are laid out in the sequence in which they are defined. That is, in struct test_3, the element a comes first, then c, then s, then b. That is, a is at the lowest address (and the standard mandates that there is no padding before the first element), then c is at an address higher than a (but the standard does not mandate that it will be one byte higher), then s is at an address higher than c, and that b is at an address higher than s. Further, the elements cannot overlap. There may be padding after the last element of a structure. For example in struct test_4, on many computers, there will be 7 bytes of padding between a and d, and there will be 7 bytes of padding after b:
struct test_4
{
char a;
double d;
char b;
};
This ensures that every element of an array of struct test_4 will have the d member properly aligned on an 8-byte boundary for optimal access (but at the cost of space; the size of the structure is often 24 bytes).
As noted in the first comment to the question, the layout and alignment of the structure is independent of whether the space is allocated by malloc() or on the stack or in global variables. Note that malloc() does not know what the pointer it returns will be used for. Its job is simply to ensure that no matter what the pointer is used for, there will be no misaligned access. That often means the pointer returned by malloc() will fall on an 8-byte boundary; on some 64-bit systems, the address is always a multiple of 16 bytes. That means that consecutive malloc() calls each allocating 1 byte will seldom produce addresses 1 byte apart.
For your sample code, I believe that standard does require that testdata[size] does equal 123 after the assignment. At the very least, you would be hard-pressed to find a compiler where it is not the case.
For simple structures containing plain old data (POD — simple C data types), C++ provides the same layout as C. If the structure is a class with virtual functions, etc, then the layout rules depend on the compiler. Virtual bases and the dreaded 'diamond of death' multiple inheritance, etc, also make changes to the layout of structures.
I want to know what is boundary problem with respect to allocation of size of structures?
Any keyword for the same that I can google shall be helpful.
To calculate the sizes of user-defined types, the compiler takes into account any alignment space needed for complex user-defined data structures. This is why the size of a structure in C can be greater than the sum of the sizes of its members. For example, on many systems, the following code will print 8:
struct student{
char grade; /* char is 1 byte long */
int age; /* int is 4 bytes long */
};
printf("%zu", sizeof (struct student));
The reason for this is that most compilers, by default, align complex data-structures to a word alignment boundary. In addition, the individual members are also aligned to their respective alignment boundaries. By this logic, the structure student gets aligned on a word boundary and the variable age within the structure is aligned with the next word address. This is accomplished by way of the compiler inserting "padding" space between two members or to the end of the structure to satisfy alignment requirements. This padding is inserted to align age with a word boundary. (Most processors can fetch an aligned word faster than they can fetch a word value that straddles multiple words in memory, and some don't support the operation at all)
Referenced article: data structure alignment and Structure padding