Nested struct memory layout in c - c

I have a struct a of size 16 bytes and another struct b which contains a.
Why is struct b of size 40 bytes? Where is the additional padding exactly?
typedef struct {
char w;
float x;
short int y;
float z;
} a;
typedef struct {
char w;
a x;
double y;
short int z;
} b;

The struct a has 4-byte alignment since that's the largest alignment of any of its members (i.e. float).
From there, the fields of b are laid out with the following offsets:
w: 0
padding: 1 - 3
x: 4 - 19
padding: 20 - 23
y: 24 - 31
z: 32 - 33
padding: 34 - 39
The member with the largest alignment is y which has 8-byte alignment. This results in 4 bytes of padding between x and y, as well as 6 bytes of padding at the end.
The padding can be minimized by moving z to between w and x. Then you would have:
w: 0
padding: 1
z: 2 - 3
x: 4 - 19
padding: 20 - 23
y: 24 - 31
For a total size of 32.
Alignment is of course entirely up to the implementation, but this is what you'll most likely see. The Lost Art of C Structure Packing goes into this in much more detail.

The algorithm compilers typically use to lay out structures is described in this answer.
For your structure a and characteristics typical in current C implementations (described below):
The char w has an alignment requirement of 1 byte, so it needs no padding and is placed at offset 0. It occupies 1 byte.
The float x has an alignment requirement of 4 bytes, so 3 bytes are needed to bring the current offset from 1 byte to 4 bytes. Then it occupies 4 bytes, giving us 8 total so far.
The short int y has an alignment requirement of 2 bytes, so it needs no padding from the current offset of 8 bytes. It occupies 2 bytes, bringing the total to 10.
The float z has an alignment requirement of 4 bytes, so it needs 2 bytes to bring the offset from 10 bytes to 12 bytes. It occupies 4 bytes, bringing the total to 16.
The alignment requirement of the structure is 4 bytes (the strictest alignment requirement of its members), and the current total is 16 bytes, which is already a multiple of 4, so no padding at the end is needed.
Thus the size of a is 16 bytes.
For your structure b:
The char w has an alignment requirement of 1 byte, so it needs no padding and is placed at offset 0. It occupies 1 byte.
The a x has an alignment requirement of 4 bytes, so 3 bytes of padding are needed to bring the offset from 1 byte to 4 bytes. It occupies 16 bytes, bringing the total to 20.
The double y has an alignment requirement of 8 bytes, so 4 bytes of padding are needed to bring the offset from 20 bytes to 24. It occupies 8 bytes, bringing the total to 32.
The short int z has an alignment requirement of 2 bytes, so no padding is needed from the current offset of 32 bytes. It occupies 2 bytes, bringing the total to 34.
The alignment requirement of the structure is 8 bytes (the strictest alignment requirement of its members), so 6 bytes of padding is needed to bring the total size from 34 bytes to 40.
Thus the size of b is 40 bytes.
The characteristics used are:
char has size 1 byte and alignment requirement 1 byte. This is a fixed property of the C standard.
short int has size 2 bytes and alignment requirement 2 bytes.
float has size 4 bytes and alignment requirement 4 bytes.
double has size 8 bytes and alignment requirement 8 bytes. (Having an alignment requirement of 4 bytes instead of 8 would not be very weird; hardware might load and store doubles in 32-bit chunks and not care whether either of them were 8-byte aligned.)
The C implementation does not use more padding or stricter alignment than required by the sizes and alignment requirements of structure members. (This is not mandated by the C standard, but there is generally no reason to do otherwise.)

You seem to understand why the size of a is 16 bytes, and I'll also assume that you see why a has an alignment requirement of 4 bytes (see the note about arrays in the discussion below on the alignment of b). From that, we can see that, in b, we will have 3 bytes of padding between w and x.
Now, on your platform (as on many/most), a double has a size and alignment requirement of 8 bytes – so, after the 20 bytes up to the end of x (1 for w, 3 for padding and 16 for x), we need to add 4 bytes padding between x and y, to align that double. Thus, we have now added 12 bytes to our total, giving a running size (up to y) of 32 bytes.
The z member takes another 2 bytes (running size of 34) but, as the alignment requirement of the overall b structure must be 8 bytes (so that, in an array of such, every element's y will remain properly aligned), we need a further 6 bytes to reach a size (40) that is a multiple of 8.
Here is a breakdown of the memory layout:
typedef struct {
char w; // 1 byte: total = 1
// 3 bytes padding: total = 4
float x; // 4 bytes: total = 8
short int y;// 2 bytes: total = 10
// 2 bytes padding: total = 12
float z; // 4 bytes: TOTAL = 16
} a;
typedef struct {
char w; // 1 byte: total = 1
// 3 bytes padding: total = 4
a x; // 16 bytes: total = 20
// 4 bytes padding: total = 24
double y; // 8 bytes: total = 32
short int z;// 2 bytes: total = 34
// 6 bytes padding: TOTAL = 40
} b;

Related

Memory alignment and padding in c

Consider this code segment
struct {
short x[5];
union {
float y;
long z;
} u;
} t;
Assume that objects of type short, float and long occupy 2 bytes, 4 bytes and 8 bytes, respectively. The memory requirement for variable t, taking alignment into consideration, is:
My attempt, without considering alignment: the struct reserves 10 bytes for x (5 elements of 2 bytes each) and 8 bytes for the union (the size of long z), so the total would be 18 bytes. But I want to know more about what this alignment is.
I want to know how memory alignment works.
From the C standard:
alignment
requirement that objects of a particular type be located on storage boundaries with
addresses that are particular multiples of a byte address
and further
Alignment of objects
Complete object types have alignment requirements which place restrictions on the
addresses at which objects of that type may be allocated. An alignment is an
implementation-defined integer value representing the number of bytes between
successive addresses at which a given object can be allocated. An object type imposes an
alignment requirement on every object of that type
Notice the part: implementation-defined
So an implementation of the C-standard is allowed to specify restrictions on the addresses where an object of a specific type may be located.
For instance, it could be that float should always be placed at addresses that are multiples of 8, i.e. valid addresses would be X * 8. So 4000, 4008, 4016 would be valid while 4001, 4002, 4003, 4004, 4005, 4006, 4007 would be invalid.
For such implementations padding will be inserted into structs in order to get a valid address.
For your example:
If your compiler requires 8-bytes alignment of long, it will have to insert padding between x and z to make z start at an 8 byte aligned address. The size will then be 24 bytes.
But remember that this is implementation defined.
You can try this program:
#include <stdio.h>
struct {
short x[5];
union {
float y;
long z;
} u;
}t;
int main(void) {
printf("Size of t is %zu\n", sizeof(t));
printf("Size of t.x is %zu\n", sizeof(t.x));
printf("Size of t.u.y is %zu\n", sizeof(t.u.y));
printf("Size of t.u.z is %zu\n", sizeof(t.u.z));
printf("Location of t is %p\n", (void*)&t);
printf("Location of t.x is %p\n", (void*)t.x);
printf("Location of t.y is %p\n", (void*)&t.u.y);
printf("Location of t.z is %p\n", (void*)&t.u.z);
return 0;
}
Possible output:
Size of t is 24
Size of t.x is 10
Size of t.u.y is 4
Size of t.u.z is 8
Location of t is 0x559b60552020
Location of t.x is 0x559b60552020
Location of t.y is 0x559b60552030
Location of t.z is 0x559b60552030
Notice here that the size of t.x is 10 but the address distance between t.x and t.y is 16 (aka 0x10) so there are 6 bytes padding between t.x and t.z.
therefore total would be equal to 18 bytes but i want to know more about what is this alignment?
I am a compiler. I assume from your post that:
type - size and alignment
short - 2 bytes
float - 4 bytes
long - 8 bytes
So I have this code:
struct {
short x[5]; // 2 * 5 = 10 bytes, has to start at address divisible by 2
union {
float y; // 4 bytes, has to start at address divisible by 4
long z; // 8 bytes, has to start at address divisible by 8
} u; // a union, so we take the biggest size and alignment of its members..
// so it will have 8 bytes, and it also has to start at address
// divisible by 8
} t; // a structure, I need to take the biggest alignment requirement of members
// so it has to start at address divisible by 8
// and it has at least the size of sum of members
// so at least 10 + 8 bytes + padding
So because _Alignof(long) = 8, I'll make typeof(t) start at address divisible by 8. So I'll look at my linker script and pick... and pick for example memory address 200.
But u needs to start at address divisible by 8. So:
struct { // memory cell 200
short x[5]; // 5 half-words in memory cells 200 - 209
// (209 inclusive, 210 exclusive)
// 210 % 8 = 2, so we need to insert 6 bytes padding here
// so that union will start at address divisible by 8
// padding 6 bytes, memory cells 210 - 215
union { // long-word in memory cells 216 - 223
// 216 is divisible by 8
float y; // word in memory cells 216 - 219
long z; // long-word in memory cells 216 - 223
} u;
} t; // so it has size of 10 + 6 bytes padding + 8 = 24 bytes
Alignment just means that a variable has to start at a memory address that is divisible by a particular number. Padding is inserted between members so that each one starts at an address divisible by the number it needs.

Size of a structure in C [duplicate]

I was trying to find the size of a structure, which I thought should be 24 bytes on my 64-bit Mac OS; instead it was shown as 32 bytes. What am I missing?
#include<stdio.h>
int main() {
struct Test{
int a;
int *b;
char *c;
float d;
}m;
int size = sizeof(m);
printf("%d\n",size);
}
Each field is aligned to its natural alignment, which is 4 for ints and floats and 8 for pointers on this platform, so padding bytes are left unused before a field where needed. The structure as a whole takes the strictest alignment of its members (8 here, from the pointers), and its size is rounded up to a multiple of that:
a: offset 0
b: offset 8 (4 bytes padding before)
c: offset 16
d: offset 24
padding 4 bytes to round the size up from 28 to 32, a multiple of 8.
Alignment. 4 bytes int, padding 4 bytes, 8 bytes pointer, eight bytes pointer, 4 bytes float, 4 bytes padding.

Memory locations, and ranges

Say I have a struct:
struct guitar{
long guitarID;
short brand:3;
short strings: 6;
short price;
}x[5][5]; //Thanks chux
If the address of x is 0xaaa and memory is aligned at multiples of 4 then what would the address be at x[1]?
The other thing I want to know is what the range of numbers between brand and strings are now that they are affected by a bitfield?
Assuming long is 8 bytes, short is 2 bytes and memory is 4-byte aligned, the size of the struct is 8 bytes + (3 bits + 6 bits, which share one 2-byte short) + 2 bytes = 8 + 4 = 12 bytes.
x[1] is nothing but &x[1][0].
If x is 0xaaa , x[1] is 0xaaa + (5 * 12) = 0xaaa + 60.
So x[1] is 60 bytes away from x.
Let's try to compute it. The first thing you need to know the size of the struct. Since the size is implementation dependent let's consider a 32-bit machine.
The first member of your structure, guitarID, has 4 bytes. Then, you have 3 bits in brand, 6 bits in strings. These 2, along with padding, make up another 2 bytes. And then, you have another 2 bytes in price. In total, your structure occupies 8 bytes.
Now, let's see how your array is stored. You have a matrix of 5 by 5 elements. In memory, it is stored linearly, like this:
x[0][0] x[0][1] x[0][2] x[0][3] x[0][4] x[1][0] x[1][1] x[1][2] x[1][3]
and so on. I don't know exactly what you mean by x[1], but I assume that you're interested in the address of x[1][0]. You can see that it has 5 elements before it, which means that it has an address 5 * 8 = 40 bytes higher than the address of the first element. I cannot give you an absolute address as an answer, since 0xaaa, which you mentioned as the address of the first element, is not word aligned.

memory allocation for structures elements

Hi I am having difficulties in understanding about how the memory is allocated to the structure elements.
For example if i have the below structure and the size of char is 1 and int is 4 bytes respectively.
struct temp
{
char a;
int b;
};
I am aware that the size of the structure would be 8. Because there will be a padding of 3 bytes after the char, and the next element should be placed in multiple of 4 so the size will be 8.
Now consider the below structure.
struct temp
{
int a; // size is 4
double b; // size is 8
char c; // size is 1
double d; // size is 8
int e; // size is 4
};
This is the output I got for the above structure:
size of node is 40
the address of node is 3392515152 ( =: base)
the address of a in node is 3392515152 (base + 0)
the address of b in node is 3392515160 (base + 8)
the address of c in node is 3392515168 (base + 16)
the address of d in node is 3392515176 (base + 24)
the address of e in node is 3392515184 (base + 32)
The total memory sum up to 36 bytes, why does it show as 40 bytes?
If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
Can any one plz solve my doubt.
Thanks in advance,
Saravana
It seems that on your system, double has to have the alignment of 8.
struct temp {
int a; // size is 4
// padding 4 bytes
double b; // size is 8
char c; // size is 1
// padding 7 bytes
double d; // size is 8
int e; // size is 4
// padding 4 bytes
};
// Total 4+4+8+1+7+8+4+4 = 40 bytes
Compiler adds an extra 4 bytes to the end of struct to make sure that array[1].b will be properly aligned.
Without end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 36
&array[1].b == 36 + 8 == 44
44 % 8 == 4 -> ERROR, not aligned!
With end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 40
&array[1].b == 40 + 8 == 48
48 % 8 == 0 -> OK!
Note that sizes, alignments, and paddings depend on target system and compiler in use.
In your calculation, you ignore the fact that e is subject to be padded as well:
The struct looks like
0 8 16 24 32
AAAAaaaaBBBBBBBBCcccccccDDDDDDDDEEEEeeee
where uppercase is the variable itself, and lowercase is the padding applied to it.
As you can see (from the addresses as well), each field is padded to 8 bytes, which is the size of the largest field in the structure.
As the structure might be used in an array, and all array elements should be well-aligned as well, the padding to e is necessary.
It's heavily dependent on both your processor architecture and compiler. Modern machines and compilers may choose larger or smaller padding to reduce the access cost to data.
Four-byte alignment means that two address lines are unused. Eight, three. A chip can use that to address more memory (coarser grain) with the same amount of hardware.
A compiler might use a similar trick for various reasons, but no compiler is required to do anything but be no less fine-grained than the processor. Often, they'll just take the biggest-size value and use it exclusively for that block. In your case, that's a double, which is eight bytes.
This is compiler-dependent behavior.
Some compilers store a double only at an offset that is a multiple of 8 bytes.
If you modify the structure as below, you will get a different result.
struct temp
{
double b; // size is 8
int a; // size is 4
int e; // size is 4
double d; // size is 8
char c; // size is 1
};
Every programmer should know what padding their compiler is doing.
E.g. if you are working on an ARM platform and you configure the compiler not to pad structure elements, then accessing structure elements through pointers may produce an 'odd' address, for which the processor generates an exception.
Every structure will also have alignment requirements
for example :
typedef struct structc_tag
{
char c;
double d;
int s;
} structc_t;
Applying the same analysis, structc_t needs sizeof(char) + 7 bytes of padding + sizeof(double) + sizeof(int) = 1 + 7 + 8 + 4 = 20 bytes. However, sizeof(structc_t) will be 24 bytes, because structure type variables have a natural alignment of their own, just as structure members do. Let us understand it by an example. Say we declare an array of structc_t as shown below:
structc_t structc_array[3];
Assume the base address of structc_array is 0x0000 for easy calculations. If structc_t occupied 20 (0x14) bytes as we calculated, the second array element (index 1) would start at 0x0000 + 0x0014 = 0x0014. The double member of this element would be allocated at 0x0014 + 0x1 + 0x7 = 0x001C (decimal 28), which is not a multiple of 8 and conflicts with the alignment requirement of double, which is 8 bytes. To avoid such misalignment, the compiler gives every structure an alignment requirement of its own: that of its most strictly aligned member. In our case the alignment of structa_t is 2, structb_t is 4 and structc_t is 8. For nested structures, the alignment of the most strictly aligned inner structure becomes the alignment of the structure that immediately contains it.
In structc_t of the above program, there will be padding of 4 bytes after int member to make the structure size multiple of its alignment. Thus the sizeof (structc_t) is 24 bytes. It guarantees correct alignment even in arrays. You can cross check
To avoid structure padding:
The #pragma pack(1) directive can be used to place each structure member immediately after the end of the previous one.
#pragma pack(1)
struct temp
{
int a; // size is 4
int b; // size is 4
double s; // size is 8
char ch; //size is 1
};
The size of the structure would be 17.
If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
It can't because of the double elements in there.
It's clear that the compiler and architecture you are using require a double to be eight-byte aligned. This is obvious because there are seven bytes of padding after the char c.
This requirement also means that the entire struct must be eight byte aligned. There's no point in carefully making all the doubles aligned to eight bytes relative to the start of the struct if the struct itself is only four byte aligned. Hence the padding after the final int to make sizeof(temp) a multiple of eight.
Note that this alignment requirement need not be a hard requirement. The compiler could choose to do the alignment even if doubles can be four byte aligned on the grounds that it might take more memory cycles to access the double if it's only four byte aligned.

Confusion in Structure Member Alignment

typedef struct structc_tag
{
char c;
double d;
int s;
} structc_t;
I read in a blog that this will take 24 bytes of data:
sizeof(char) + 7 byte padding + sizeof(double) + sizeof(int) + 4 byte padding = 1 + 7 + 8 + 4 + 4 = 24 bytes.
My question is: why 7 bytes of padding? Why can't we use 3 bytes of padding there and use the next 8 bytes for the double? And what is the need for the last 4 bytes?
You need to consider what happens if you allocate an array of these structures with malloc():
structc_t *a = malloc(2 * sizeof *a);
Consider a platform where sizeof(double) == 8, sizeof(int) == 4 and the required alignment of double is 8. malloc() always returns an address correctly aligned for storing any C type, so in this case a will be 8-byte aligned. The padding requirements then naturally fall out:
In order for a[0].d to be 8-byte aligned, there must therefore be 7 bytes of padding after a[0].c;
In order for a[1].d to be 8-byte aligned, the overall struct size must be a multiple of 8, so there must therefore be 4 bytes of padding after a[0].s.
If you re-order the struct from largest to smallest:
typedef struct structc_tag
{
double d;
int s;
char c;
} structc_t;
...then the only padding required is 3 bytes after .c, to make the structure size a multiple of 8. This results in the total size of the struct being 16, rather than 24.
It's platform dependent: it comes down to what double is aligned to. If it's aligned to 8 bytes, which appears to be the case here, 3 bytes of padding won't cut it.
If double was aligned to 4 bytes, you'd be right and 3 bytes of padding would be used.

Resources