Sizeof a struct in C - c

Well, after reading this Size of structure with a char, a double, an int and a t I still don't get the size of my struct which is :
struct s {
char c1[3];
long long k;
char c2;
char *pt;
char c3;
}
And sizeof(struct s) returns me 40
But according to the post I mentioned, I thought that the memory should like this way:
0 1 2 3 4 5 6 7 8 9 a b c d e f
+-------------+- -+---------------------------+- - - - - - - -+
| c1 | |k | |
+-------------+- -+---------------------------+- - - - - - - -+
10 11 12 13 14 15 16 17
+---+- -+- -+- - - - - -+----+
|c2 | |pt | | c3 |
+---+- -+- -+- - - - - -+----+
And I should get 18 instead of 40...
Can someone explain to me what I am doing wrong ? Thank you very much !

Assuming an 8-byte pointer size and alignment requirement on long long and pointers, then:
3 bytes for c1
5 bytes padding
8 bytes for k
1 byte for c2
7 bytes padding
8 bytes for pt
1 byte for c3
7 bytes padding
That adds up to 40 bytes.
The trailing padding is allocated so that arrays of the structure keep all the elements of the structure properly aligned.
Note that the sizes, alignment requirements and therefore padding depend on the machine hardware, the compiler, and the platform's ABI (Application Binary Interface). The rules I used are common rules: an N-byte type (for N in {1, 2, 4, 8, 16 }) needs to be allocated on an N-byte boundary. Arrays (both within the structure and arrays of the structure) also need to be properly aligned. You can sometimes dink with the padding with #pragma directives; be cautious. It is usually better to lay out the structure with the most stringently aligned objects at the start and the less stringently aligned ones at the end.
If you used:
struct s2 {
long long k;
char *pt;
char c1[3];
char c2;
char c3;
};
the size required would be just 24 bytes, with just 3 bytes of trailing padding. Order does matter!

The size of the structure depends upon what compiler is used and what compiler options are enabled. The C language standard makes no promises about how memory is utilized when the compiler creates structures, and different architectures (for example 32-bit WinTel vs 64-bit WinTel) cause different layout decisions even when the same compiler is used.
Essentially, the size of a structure is equal to the sum of the size of the bytes needed by the field elements (which can generally be calculated) plus the sum of the padding bytes injected by the compiler (which is generally not known).

It is because of alignment, gcc has
#pragma pack(push,n)
// declare your struct here
#pragma pack(pop)
to change it. Read here, and also __attribute__((__packed__)).
If you declare the struct
struct packed
{
char c1[3];
long long k;
char c2;
char *pt;
char c3;
} __attribute__((__packed__));
then compiling with gcc, sizeof(packed) = 18 since
c1: 3
k : 8
c2: 1
pt: 4 // it depends
c3: 1
Apparently Visual C++ compiler supports #pragma pack(push,n) too.

what is a size of structure?
#include <stdio.h>
struct {
char a;
char b;
char c;
}st;
int main()
{
printf("%ld", sizeof(st));
return 0;
}
it shows 3 in gdb compiler.

Related

Structures padding in C [duplicate]

This question already has answers here:
Structure padding in C
(6 answers)
Closed 1 year ago.
int is 4 bytes on my machine, long 8 bytes, etc.
Hey, so I've encountered a pretty interesting thing in C and started wondering how structures manage their data inside. I thought it works like an array, but oh boy, I was wrong. So basically, I thought that the data inside sums up itself, but I've found out on stack overflow, that some compilers might do some optimizations due to processor's architecture requirements. And there come alignments. I've found two links about alignments, and I've wanted to calculate my struct's size and I've experimented a bit, but I think I understand that in some ways, and in some not. That's why I wanted to create that topic, since I couldn't fully grasp some of the examples provided by people who were answering in those topics. For example:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 24
I was expecting the compiler to do an optimization like this:
char a is 1 byte, char b is 1 byte, thus we don't need to align. char b is 1 byte, int c is 4 bytes, thus we need to align 3 bytes. int c is 4 bytes, long d is 8 bytes, thus we need to align 4 bytes. long d is 8 bytes, int e is 4 bytes, thus we need to align 4 bytes. And till this point the total size is 29. Rounding it with ceiling to the nearest even number gives 30. Why it is 24 then?
I've also found out that the char a + char b give a padding equal to 2 bytes, so we only need to align 2 more bytes, thus maybe that's where I'm making a mistake. Also if I add more variables:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
char f;
char g;
char h;
char i;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 24
The total size is still 24 bytes. But if I add one more variable:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
char f;
char g;
char h;
char i;
char j;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 32
The size changes to total of 32 bytes. Why? What exactly happens? Sorry if an answer for that question is pretty obvious for you, but I truly don't understand. Also I don't know if that differs between compilers, so if I didn't provide some information, just tell me and I will add that.
It all comes down to alignment. The compiler wants to keep each element aligned to an address that's a multiple of that item's size, because the hardware can access it most efficiently that way. (And on some architectures, the hardware can only access it that way); unaligned access are disallowed.)
You've got one element in your structure that's a long int of size 8, so its alignment is going to drive everything else. Here's how your first structure would be laid out:
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
0 | a | b | pad | c |
+---+---+---+---+---+---+---+---+
8 | d |
+---+---+---+---+---+---+---+---+
16 | e | padding |
+---+---+---+---+---+---+---+---+
So, as you can see, the size is 24, including two invisible, unnamed "padding" fields of 2 and 4 bytes, respectively.
Structure padding and alignment can be confusing. (It took me an embarrassingly large number of tries to get this answer right.) Fortunately, you usually don't have to worry about any of this, because it's the compiler's problem, not yours.
You can get the compiler to tell you how it's laying a structure out by using the offsetof macro:
int main(void){
printf("a # %zd\n", offsetof(struct test, a));
printf("b # %zd\n", offsetof(struct test, b));
printf("c # %zd\n", offsetof(struct test, c));
printf("d # %zd\n", offsetof(struct test, d));
printf("e # %zd\n", offsetof(struct test, e));
printf("size = %zd\n", sizeof(struct test));
return 0;
}
On my machine (which seems to be behaving the same as yours) this prints:
a # 0
b # 1
c # 4
d # 8
e # 16
size = 24
Notice that I have used %zd instead of %d, since sizeof and offsetof give their answers as type size_t, not int.
When you added char fields f, g, h, and i, they could fit into the second padding space, without making the overall structure any bigger. It was only when you added j that it pushed things over into another 8-byte chunk:
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
0 | a | b | pad | c |
+---+---+---+---+---+---+---+---+
8 | d |
+---+---+---+---+---+---+---+---+
16 | e | f | g | h | i |
+---+---+---+---+---+---+---+---+
24 | j | padding |
+---+---+---+---+---+---+---+---+

C Sizeof char[] in struct [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 3 years ago.
Improve this question
I know what padding is and how alignment works. Given the struct below:
typedef struct {
char word[10];
short a;
int b;
} Test;
I don't understand how C interprets and aligns the char array inside the struct. It should be 9 chars + terminator and it should be regarded as the longest like this:
| - _ - _ - _ - _ - word - _ - _ - _ - _ - |
| - a - | - _ - b - _ - | padding the remaining 4 bytes
The "-" represents a byte and "_" separates the bytes. So we have the 10 bytes long word, the 2 bytes long a and the 4 bytes long b and padding of 4 bytes. But when I print sizeof(Test) it returns 16.
EDIT: I got it.
In a struct like
struct {
char word[10];
short a;
int b;
}
you have the following requirements:
a needs an even offset. As the char arry before it has an even length, there is no need for padding. So a sits at offset 10.
b needs an offset which is dividible by 4. 12 is dividible by 4, so 12 is a fine offset for b.
The whole struct needs a size which is dividible by 4, because every b in an array of this struct needs to have the said requirement. But as we are currently at size 16, we don't need any padding.
WWWWWWWWWWAABBBB
|-- 10 --| 2 4 = 16
Compare this with
struct {
char word[11];
short a;
int b;
}
Here, a would have offset 11. This is not allowed, thus padding is inserted. a is fine with an offset of 12.
b would then get an offset of 14, which isn't allowed either, so 2 bytes are added. b gets an offset of 16. The whole struct gets a size of 20, which is fine for all subsequent items in an array.
WWWWWWWWWWW.AA..BBBB
|-- 11 --|1 2 2 4 = 20
Third example:
struct {
char word[11];
int b;
short a;
}
(note the changed order!)
b is happy with an offset of 12 (it gets 1 padding byte),
a is happy with an offset of 16. (no padding before it.)
After the struct, however, 2 bytes of padding are added so that the struct aligns with 4.
WWWWWWWWWW..BBBBAA..
|-- 10 --| 2 4 2 2 = 20
In:
struct
{
char word[10];
short a;
int b;
}
and given two-byte short and four-byte int, the structure is laid out in memory:
Offset Member
0 word[0]
1 word[1]
2 word[2]
3 word[3]
4 word[4]
5 word[5]
6 word[6]
7 word[7]
8 word[8]
9 word[9]
10 a
11 a
12 b
13 b
14 b
15 b
To get the layout described in the question, where a and b overlap word, you need to use a struct inside a union:
typedef union
{
char word[10];
struct { short a; int b; };
} Test;
Generally, each variable will be aligned on a boundary of its size.
(unless attributes such as packed are applied)
A complete discussion is on Wikipedia, which says in part:
A char (one byte) will be 1-byte aligned.
A short (two bytes) will be 2-byte aligned.
An int (four bytes) will be 4-byte aligned.
A long (four bytes) will be 4-byte aligned.
A float (four bytes) will be 4-byte aligned.
A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on
Linux (8-byte with -malign-double compile time option).
A long long (eight bytes) will be 4-byte aligned.
So your structure is laid out as:
typedef struct {
char word[10];
// Aligned with beginning of structure; takes bytes 0-9
short a;
// (assuming short is 2-bytes)
// Previous member ends on byte 9, this one starts on byte-10.
// Byte 10 is a multiple of 2, so no padding necessary
// Takes bytes 10 and 11
int b;
// Previous member ends on byte 11, next byte is 12, which is a multiple of 4.
// No padding necessary
// Takes bytes 12, 13, 14, 15.
} Test;
Total size: 16 bytes.
If you want to play with it, change your word-array to 9 or 11 bytes,
or reverse the order of your short and int, and you'll see the size of the structure change.

different between C struct bitfields on char and on int

When using bitfields in C, I found out differences I did not expect related to the actual type that is used to declare the fields.
I didn't find any clear explanation. Now, the problem is identified, so if though there is no clear response, this post may be useful to anyone facing the same issue.
Still if some can point to a formal explanation, this coudl be great.
The following structure, takes 2 bytes in memory.
struct {
char field0 : 1; // 1 bit - bit 0
char field1 : 2; // 2 bits - bits 2 down to 1
char field2 ; // 8 bits - bits 15 down to 8
} reg0;
This one takes 4 bytes in memory, the question is why ?
struct {
int field0 : 1; // 1 bit - bit 0
int field1 : 2; // 2 bits - bits 2 down to 1
char field2 ; // 8 bits - bits 15 down to 8
} reg1;
In both cases, the bits are organized in memory in the same way: field 2 is always taking bits 15 down to 8.
I tried to find some literarure on the subject, but still can't get a clear explanation.
The two most appropriate links I can found are:
http://www.catb.org/esr/structure-packing/
http://www.msg.ucsf.edu/local/programs/IBM_Compilers/C:C++/html/language/ref/clrc03defbitf.htm
However, none really explains really why the second structure is taking 4 bytes. Actually reading carefully, I would even expect the structure to take 2 bytes.
In both cases,
field0 takes 1 bit
field1 takes 2 bits
field2 takes 8 bits, and is aligned on the first available byte address
Hence, the useful data requires 2 bytes in both cases.
So what is behind the scene that makes reg1 to take 4 bytes ?
Full Code Example:
#include "stdio.h"
// Register Structure using char
typedef struct {
// Reg0
struct _reg0_bitfieldsA {
char field0 : 1;
char field1 : 2;
char field2 ;
} reg0;
// Nextreg
char NextReg;
} regfileA_t;
// Register Structure using int
typedef struct {
// Reg1
struct _reg1_bitfieldsB {
int field0 : 1;
int field1 : 2;
char field2 ;
} reg1;
// Reg
char NextReg;
} regfileB_t;
regfileA_t regsA;
regfileB_t regsB;
int main(int argc, char const *argv[])
{
int* ptrA, *ptrB;
printf("sizeof(regsA) == %-0d\n",sizeof(regsA)); // prints 3 - as expected
printf("sizeof(regsB) == %-0d\n",sizeof(regsB)); // prints 8 - why ?
printf("\n");
printf("sizeof(regsA.reg0) == %-0d\n",sizeof(regsA.reg0)); // prints 2 - as epxected
printf("sizeof(regsB.reg0) == %-0d\n",sizeof(regsB.reg1)); // prints 4 - int bit fields tells the struct to use 4 bytes then.
printf("\n");
printf("addrof(regsA.reg0) == 0x%08x\n",(int)(&regsA.reg0)); // 0x0804A028
printf("addrof(regsA.reg1) == 0x%08x\n",(int)(&regsA.NextReg)); // 0x0804A02A = prev + 2
printf("addrof(regsB.reg0) == 0x%08x\n",(int)(&regsB.reg1)); // 0x0804A020
printf("addrof(regsB.reg1) == 0x%08x\n",(int)(&regsB.NextReg)); // 0x0804A024 = prev + 4 - my register is not at the righ place then.
printf("\n");
regsA.reg0.field0 = 1;
regsA.reg0.field1 = 3;
regsA.reg0.field2 = 0xAB;
regsB.reg1.field0 = 1;
regsB.reg1.field1 = 3;
regsB.reg1.field2 = 0xAB;
ptrA = (int*)&regsA;
ptrB = (int*)&regsB;
printf("regsA.reg0.value == 0x%08x\n",(int)(*ptrA)); // 0x0000AB07 (expected)
printf("regsB.reg0.value == 0x%08x\n",(int)(*ptrB)); // 0x0000AB07 (expected)
return 0;
}
When I first write the struct I expected to get reg1 to take only 2 bytes, hence the next register was at the offset = 2.
The relevant part of the standard is C11/C17 6.7.2.1p11:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
that, in connection with C11/C17 6.7.2.1p5
A. bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type. It is implementation-defined whether atomic types are permitted.
and that you're using char means that there is nothing to discuss in general - for a specific implementation check the compiler manuals. Here's the one for GCC.
From the 2 excerpts it follows that an implementation is free to use absolutely whatever types it wants to to implement the bitfields - it could even use int64_t for both of these cases having the structure of size 16 bytes. The only thing a conforming implementation must do is to pack the bits within the chosen addressable storage unit if enough space remains.
For GCC on System-V ABI on 386-compatible (32-bit processors), the following stands:
Plain bit-fields (that is, those neither signed nor unsigned) always have non- negative values. Although they may have type char, short, int, long, (which can have negative values),
these bit-fields have the same range as a bit-field of the same size
with the corresponding unsigned type. Bit-fields obey the same
size and alignment rules as other structure and union members, with
the following additions:
Bit-fields are allocated from right to left (least to most significant).
A bit-field must entirely reside in a storage unit appropriate for its declared type. Thus a bit-field never crosses its unit boundary.
Bit-fields may share a storage unit with other struct/union members, including members that are not bit-fields. Of course,
struct members occupy different parts of the storage unit.
Unnamed bit-fields' types do not affect the alignment of a structure or union, although individual bit-fields' member offsets obey the
alignment constraints.
i.e. in System-V ABI, 386, int f: 1 says that the bit-field f must be within an int. If entire bytes of space remains, a following char within the same struct will be packed inside this int, even if it is not a bit-field.
Using this knowledge, the layout for
struct {
int a : 1; // 1 bit - bit 0
int b : 2; // 2 bits - bits 2 down to 1
char c ; // 8 bits - bits 15 down to 8
} reg1;
will be
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|a b b x x x x x|c c c c c c c c|x x x x x x x x|x x x x x x x x|
<------------------------------ int ---------------------------->
and the layout for
struct {
char a : 1; // 1 bit - bit 0
char b : 2; // 2 bits - bits 2 down to 1
char c ; // 8 bits - bits 15 down to 8
} reg1;
will be
1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
|a b b x x x x x|c c c c c c c c|
<---- char ----><---- char ---->
So there are tricky edge cases. Compare the 2 definitions here:
struct x {
short a : 2;
short b : 15;
char c ;
};
struct y {
int a : 2;
int b : 15;
char c ;
};
Because the bit-field must not cross the unit boundary, the struct x members a and b need to go to different shorts. Then there is not enough space to accommodate the char c, so it must come after that. And the entire struct must be suitably aligned for short so it will be 6 bytes on i386. The latter however, will pack a and b in the 17 lowest bits of the int, and since there is still one entire addressable byte left within the int, the c will be packed here too, and hence sizeof (struct y) will be 4.
Finally, you must really specify whether the int or char is signed or not - the default might be not what you expect! Standard leaves it up to the implementation, and GCC has a compile-time switch to change them.

memory allocation for structures elements

Hi I am having difficulties in understanding about how the memory is allocated to the structure elements.
For example if i have the below structure and the size of char is 1 and int is 4 bytes respectively.
struct temp
{
char a;
int b;
};
I am aware that the size of the structure would be 8. Because there will be a padding of 3 bytes after the char, and the next element should be placed in multiple of 4 so the size will be 8.
Now consider the below structure.
struct temp
{
int a; // size is 4
double b; // size is 8
char c; // size is 4
double d; // size is 8
int e; // size is 4
};
This is the o/p i got for the above strucure
size of node is 40
the address of node is 3392515152 ( =: base)
the address of a in node is 3392515152 (base + 0)
the address of b in node is 3392515160 (base + 8)
the address of c in node is 3392515168 (base + 16)
the address of d in node is 3392515176 (base + 24)
the address of e in node is 3392515184 (base + 32)
The total memory sum up to 36 bytes, why does it show as 40 bytes?
If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
Can any one plz solve my doubt.
Thanks in advance,
Saravana
It seems that on your system, double has to have the alignment of 8.
struct temp {
int a; // size is 4
// padding 4 bytes
double b; // size is 8
char c; // size is 1
// padding 7 bytes
double d; // size is 8
int e; // size is 4
// padding 4 bytes
};
// Total 4+4+8+1+7+8+4+4 = 40 bytes
Compiler adds an extra 4 bytes to the end of struct to make sure that array[1].b will be properly aligned.
Without end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 36
&array[1].b == 36 + 8 == 44
44 % 8 == 4 -> ERROR, not aligned!
With end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 40
&array[1].b == 40 + 8 == 48
48 % 8 == 0 -> OK!
Note that sizes, alignments, and paddings depend on target system and compiler in use.
In your calculation, you ignore the fact that e is subject to be padded as well:
The struct looks like
0 8 16 24 32
AAAAaaaaBBBBBBBBCcccccccDDDDDDDDEEEEeeee
where uppercase is the variable itself, and lowercase is the padding applied to it.
As you see (and as well from the addresses), each field is padded to 8 bytes, which is the largest field in the structure.
As the structure might be used in an array, and all array elements should be well-aligned as well, the padding to e is necessary.
It's heavily dependent on both your processor architecture and compiler. Modern machines and compilers may choose larger or smaller padding to reduce the access cost to data.
Four-byte alignment means that two address lines are unused. Eight, three. A chip can use that to address more memory (coarser grain) with the same amount of hardware.
A compiler might use a similar trick for various reasons, but no compiler is required to do anything but be no less fine-grained than the processor. Often, they'll just take the biggest-size value and use it exclusively for that block. In your case, that's a double, which is eight bytes.
This is a compiler dependent behavior.
Some compiler makes that 'double' to be stored after 8 bit offset.
IF you modify the structure as below you will get different result.
struct temp
{
double b; // size is 8
int a; // size is 4
int e; // size is 4
double d; // size is 8
char c; // size is 4
}
Every programmer should know what padding you compiler is doing.
E.g. If you are working on ARM platform and you set compiler settings to do not pad structure elements[ then accessing structure elements through pointers may generate 'odd' address for which processor generates an exception.
Every structure will also have alignment requirements
for example :
typedef struct structc_tag
{
char c;``
double d;
int s;
} structc_t;
Applying same analysis, structc_t needs sizeof(char) + 7 byte padding + sizeof(double) + sizeof(int) = 1 + 7 + 8 + 4 = 20 bytes. However, the sizeof(structc_t) will be 24 bytes. It is because, along with structure members, structure type variables will also have natural alignment. Let us understand it by an example. Say, we declared an array of structc_t as shown below structc_t structc_array[3];
Assume, the base address of structc_array is 0×0000 for easy calculations. If the structc_t occupies 20 (0×14) bytes as we calculated, the second structc_t array element (indexed at 1) will be at 0×0000 + 0×0014 = 0×0014. It is the start address of index 1 element of array. The double member of this structc_t will be allocated on 0×0014 + 0×1 + 0×7 = 0x001C (decimal 28) which is not multiple of 8 and conflicting with the alignment requirements of double. As we mentioned on the top, the alignment requirement of double is 8 bytes. In order to avoid such misalignment, compiler will introduce alignment requirement to every structure. It will be as that of the largest member of the structure. In our case alignment of structa_t is 2, structb_t is 4 and structc_t is 8. If we need nested structures, the size of largest inner structure will be the alignment of immediate larger structure.
In structc_t of the above program, there will be padding of 4 bytes after int member to make the structure size multiple of its alignment. Thus the sizeof (structc_t) is 24 bytes. It guarantees correct alignment even in arrays. You can cross check
to avoid structure padding!
#pragma pack ( 1 ) directive can be used for arranging memory for structure members very next to the end of other structure members.
#pragma pack(1)
struct temp
{
int a; // size is 4
int b; // size is 4
double s; // size is 8
char ch; //size is 1
};
size of structure would be:17
If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
It can't because of the double elements in there.
It's clear that the compiler and architecture you are using requires a double to be eight byte aligned. This is obvious because there is seven bytes of padding after the char c.
This requirement also means that the entire struct must be eight byte aligned. There's no point in carefully making all the doubles aligned to eight bytes relative to the start of the struct if the struct itself is only four byte aligned. Hence the padding after the final int to make sizeof(temp) a multiple of eight.
Note that this alignment requirement need not be a hard requirement. The compiler could choose to do the alignment even if doubles can be four byte aligned on the grounds that it might take more memory cycles to access the double if it's only four byte aligned.

Structs in a 32-bit architecture [duplicate]

This question already has answers here:
What is the meaning of "__attribute__((packed, aligned(4))) "
(3 answers)
Closed 9 years ago.
The following code;
struct s1 {
void *a;
char b[2];
int c;
};
struct s2 {
void *a;
char b[2];
int c;
}__attribute__((packed));
if s1 has a size of 12 bytes and s2 has a size of 10 bytes, is this due to data being read in 4 byte chunks and }__attribute__((packed)); reduces the size of void*a; to only 2 bytes?
A little confused as to what }__attribute__((packed)); does.
Many thanks
It is due to alignment, a process in which the compiler adds hidden "junk" between the fields to make sure they have optimal (for performance) starting addresses.
Using packed forces the compiler to not do that, which often means that accessing the structure becomes slower (or simply impossible, causing e.g. a bus error) if the hardware has problems doing e.g. 32-bit accesses on addresses that are not multiples of 4.
On Intel processors, the fetches of 32-bit aligned data is considerably faster than unaligned; on many other processors unaligned fetches might be illegal altogether, or need to be simulated using 2 instructions. Thus the first structure would have the c always on these 32-bit architectures aligned to a byte address divisible by 4. This however requires that 2 bytes will be wasted in storage.
struct s1 {
void *a;
char b[2];
int c;
};
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | NA | NA | c0 | c1 | c2 | c3 |
// addresses increasing ====>
On the other hand, sometimes you absolutely need to map some unaligned datastructures (like file formats, or network packets), as is, into C structures; there you can use the __attribute__((packed)) to specify that you want everything without padding bytes:
struct s2 {
void *a;
char b[2];
int c;
} __attribute__((packed));
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | c0 | c1 | c2 | c3 |
// addresses increasing ====>
This is due to data structure alignment, a combination of two processes: data alignment and data padding. The first structure will be aligned to the word as you said, however the second structure is packed and forces the compiler to not pad the structure to the word.
The second structure is 10 bytes because the character array is 2 bytes, not the void pointer (it remains 4 bytes, as all pointers are). This can hinder performance as the trade off of 2 bytes of space is not worth the efficiency lost by the hardware (under most circumstances) and could lead to undefined behaviour.

Resources