int is 4 bytes on my machine, long 8 bytes, etc.
Hey, so I've encountered a pretty interesting thing in C and started wondering how structures manage their data inside. I thought a struct works like an array, but oh boy, I was wrong. I assumed its size was simply the sum of its members' sizes, but I found out on Stack Overflow that compilers may insert padding because of the processor's alignment requirements. I found two links about alignment and tried to calculate my struct's size by hand. I experimented a bit, and I understand some of it but not all of it. That's why I'm asking this: I couldn't fully grasp some of the examples in the answers to those questions. For example:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 24
I was expecting the compiler to do an optimization like this:
char a is 1 byte, char b is 1 byte, thus we don't need to align. char b is 1 byte, int c is 4 bytes, thus we need to align 3 bytes. int c is 4 bytes, long d is 8 bytes, thus we need to align 4 bytes. long d is 8 bytes, int e is 4 bytes, thus we need to align 4 bytes. And till this point the total size is 29. Rounding it with ceiling to the nearest even number gives 30. Why it is 24 then?
I've also found out that the char a + char b give a padding equal to 2 bytes, so we only need to align 2 more bytes, thus maybe that's where I'm making a mistake. Also if I add more variables:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
char f;
char g;
char h;
char i;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 24
The total size is still 24 bytes. But if I add one more variable:
#include <stdio.h>
struct test {
char a;
char b;
int c;
long d;
int e;
char f;
char g;
char h;
char i;
char j;
};
int main(void){
printf("test = %d\n", sizeof(test));
return 0;
}
Output:
test = 32
The size changes to total of 32 bytes. Why? What exactly happens? Sorry if an answer for that question is pretty obvious for you, but I truly don't understand. Also I don't know if that differs between compilers, so if I didn't provide some information, just tell me and I will add that.
It all comes down to alignment. The compiler wants to keep each element aligned to an address that's a multiple of that item's size, because the hardware can access it most efficiently that way. (And on some architectures, the hardware can only access it that way; unaligned accesses are disallowed.)
You've got one element in your structure that's a long int of size 8, so its alignment is going to drive everything else. Here's how your first structure would be laid out:
      0   1   2   3   4   5   6   7
    +---+---+---+---+---+---+---+---+
  0 | a | b |  pad  |       c       |
    +---+---+---+---+---+---+---+---+
  8 |               d               |
    +---+---+---+---+---+---+---+---+
 16 |       e       |    padding    |
    +---+---+---+---+---+---+---+---+
So, as you can see, the size is 24, including two invisible, unnamed "padding" fields of 2 and 4 bytes, respectively.
Structure padding and alignment can be confusing. (It took me an embarrassingly large number of tries to get this answer right.) Fortunately, you usually don't have to worry about any of this, because it's the compiler's problem, not yours.
You can get the compiler to tell you how it's laying a structure out by using the offsetof macro:
#include <stddef.h>   /* offsetof */
#include <stdio.h>

/* struct test is the first structure from the question */
int main(void){
    printf("a # %zu\n", offsetof(struct test, a));
    printf("b # %zu\n", offsetof(struct test, b));
    printf("c # %zu\n", offsetof(struct test, c));
    printf("d # %zu\n", offsetof(struct test, d));
    printf("e # %zu\n", offsetof(struct test, e));
    printf("size = %zu\n", sizeof(struct test));
    return 0;
}
On my machine (which seems to be behaving the same as yours) this prints:
a # 0
b # 1
c # 4
d # 8
e # 16
size = 24
Notice that I have used %zu instead of %d, since sizeof and offsetof give their answers as type size_t, not int.
When you added char fields f, g, h, and i, they could fit into the second padding space, without making the overall structure any bigger. It was only when you added j that it pushed things over into another 8-byte chunk:
      0   1   2   3   4   5   6   7
    +---+---+---+---+---+---+---+---+
  0 | a | b |  pad  |       c       |
    +---+---+---+---+---+---+---+---+
  8 |               d               |
    +---+---+---+---+---+---+---+---+
 16 |       e       | f | g | h | i |
    +---+---+---+---+---+---+---+---+
 24 | j |          padding          |
    +---+---+---+---+---+---+---+---+
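The layout rule used in the diagrams above can be sketched as a small simulation. This is only an illustration of the common rule (round each offset up to the member's alignment, then round the total up to the largest alignment); real compilers follow the platform ABI, and the names round_up and simulate_layout are made up for this example.

```c
#include <stddef.h>

/* Round `offset` up to the next multiple of `align` (align must be nonzero). */
static size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/*
 * Hypothetical helper: simulate the usual layout rule for n members.
 * sizes[i] and aligns[i] describe member i; offsets[i] receives its offset.
 * Returns the total size, including trailing padding.
 */
size_t simulate_layout(const size_t *sizes, const size_t *aligns,
                       size_t n, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        offset = round_up(offset, aligns[i]);  /* padding before member i */
        offsets[i] = offset;
        offset += sizes[i];
        if (aligns[i] > max_align)
            max_align = aligns[i];
    }
    return round_up(offset, max_align);        /* trailing padding */
}
```

Feeding it the sizes {1, 1, 4, 8, 4} (each member aligned to its own size) gives offsets 0, 1, 4, 8, 16 and a total of 24. Appending four 1-byte members still gives 24, and a fifth one pushes the total to 32, matching the three results in the question.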
I know what padding is and how alignment works. Given the struct below:
typedef struct {
char word[10];
short a;
int b;
} Test;
I don't understand how C interprets and aligns the char array inside the struct. It should be 9 chars + terminator and it should be regarded as the longest like this:
| - _ - _ - _ - _ - word - _ - _ - _ - _ - |
| - a - | - _ - b - _ - | padding the remaining 4 bytes
The "-" represents a byte and "_" separates the bytes. So we have the 10 bytes long word, the 2 bytes long a and the 4 bytes long b and padding of 4 bytes. But when I print sizeof(Test) it returns 16.
EDIT: I got it.
In a struct like
struct {
char word[10];
short a;
int b;
}
you have the following requirements:
a needs an even offset. As the char array before it has an even length, there is no need for padding. So a sits at offset 10.
b needs an offset which is divisible by 4. 12 is divisible by 4, so 12 is a fine offset for b.
The whole struct needs a size which is divisible by 4, because every b in an array of this struct needs to meet the said requirement. But as we are currently at size 16, we don't need any padding.
WWWWWWWWWWAABBBB
|-- 10 --| 2 4 = 16
Compare this with
struct {
char word[11];
short a;
int b;
}
Here, a would have offset 11. This is not allowed, thus padding is inserted. a is fine with an offset of 12.
b would then get an offset of 14, which isn't allowed either, so 2 bytes are added. b gets an offset of 16. The whole struct gets a size of 20, which is fine for all subsequent items in an array.
WWWWWWWWWWW.AA..BBBB
|-- 11 --|1 2 2 4 = 20
Third example:
struct {
char word[11];
int b;
short a;
}
(note the changed order!)
b is happy with an offset of 12 (it gets 1 padding byte),
a is happy with an offset of 16. (no padding before it.)
After the struct, however, 2 bytes of padding are added so that the struct aligns with 4.
WWWWWWWWWWW.BBBBAA..
|-- 11 --|1 4 2 2 = 20
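The three layouts above can be checked with offsetof and sizeof. This is a sketch, assuming a platform where int16_t is 2-byte aligned and int32_t is 4-byte aligned (they stand in for the short and int of the answer); the struct names, the MEMBER_END macro and the two functions are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width types stand in for the answer's 2-byte short and 4-byte int. */
struct word10         { char word[10]; int16_t a; int32_t b; };
struct word11         { char word[11]; int16_t a; int32_t b; };
struct word11_swapped { char word[11]; int32_t b; int16_t a; };

/* Offset of the first byte after member m of struct type t. */
#define MEMBER_END(t, m) (offsetof(t, m) + sizeof(((t *)0)->m))

/* Padding inserted between word and a in struct word11. */
size_t padding_before_a(void)
{
    return offsetof(struct word11, a) - MEMBER_END(struct word11, word);
}

/* Trailing padding after a in struct word11_swapped. */
size_t trailing_padding_swapped(void)
{
    return sizeof(struct word11_swapped) - MEMBER_END(struct word11_swapped, a);
}
```

Under those assumptions, padding_before_a() corresponds to the single '.' after the 11 W's in the second picture, and trailing_padding_swapped() to the two '.' at the end of the third.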
In:
struct
{
char word[10];
short a;
int b;
}
and given two-byte short and four-byte int, the structure is laid out in memory:
Offset Member
0 word[0]
1 word[1]
2 word[2]
3 word[3]
4 word[4]
5 word[5]
6 word[6]
7 word[7]
8 word[8]
9 word[9]
10 a
11 a
12 b
13 b
14 b
15 b
To get the layout described in the question, where a and b overlap word, you need to use a struct inside a union:
typedef union
{
char word[10];
struct { short a; int b; };
} Test;
Generally, each variable will be aligned on a boundary of its size (unless attributes such as packed are applied).
A complete discussion is on Wikipedia, which lists these typical alignments for compilers targeting 32-bit x86:
A char (one byte) will be 1-byte aligned.
A short (two bytes) will be 2-byte aligned.
An int (four bytes) will be 4-byte aligned.
A long (four bytes) will be 4-byte aligned.
A float (four bytes) will be 4-byte aligned.
A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on Linux (8-byte with -malign-double compile time option).
A long long (eight bytes) will be 4-byte aligned.
So your structure is laid out as:
typedef struct {
char word[10];
// Aligned with beginning of structure; takes bytes 0-9
short a;
// (assuming short is 2-bytes)
// Previous member ends on byte 9, this one starts on byte-10.
// Byte 10 is a multiple of 2, so no padding necessary
// Takes bytes 10 and 11
int b;
// Previous member ends on byte 11, next byte is 12, which is a multiple of 4.
// No padding necessary
// Takes bytes 12, 13, 14, 15.
} Test;
Total size: 16 bytes.
If you want to play with it, change your word array to 11 bytes, or reverse the order of your short and int, and you'll see the size of the structure change to 20.
(With a 9-byte array the size stays at 16, because one padding byte is inserted before a.)
Preface:
Did my research about struct alignment. Looked at this question, this one and also this one - but still did not find my answer.
My Actual Question:
Here is a code snippet I created in order to clarify my question:
#include "stdafx.h"
#include <stdio.h>
struct IntAndCharStruct
{
int a;
char b;
};
struct IntAndDoubleStruct
{
int a;
double d;
};
struct IntFloatAndDoubleStruct
{
int a;
float c;
double d;
};
int main()
{
printf("Int: %d\n", sizeof(int));
printf("Float: %d\n", sizeof(float));
printf("Char: %d\n", sizeof(char));
printf("Double: %d\n", sizeof(double));
printf("IntAndCharStruct: %d\n", sizeof(IntAndCharStruct));
printf("IntAndDoubleStruct: %d\n", sizeof(IntAndDoubleStruct));
printf("IntFloatAndDoubleStruct: %d\n", sizeof(IntFloatAndDoubleStruct));
getchar();
}
And its output is:
Int: 4
Float: 4
Char: 1
Double: 8
IntAndCharStruct: 8
IntAndDoubleStruct: 16
IntFloatAndDoubleStruct: 16
I get the alignment seen in the IntAndCharStruct and in the IntAndDoubleStruct.
But I just don't get the IntFloatAndDoubleStruct one.
Simply put: Why isn't sizeof(IntFloatAndDoubleStruct) = 24?
Thanks in advance!
p.s: I'm using Visual-Studio 2017, standard console application.
Edit:
Per comments, tested IntDoubleAndFloatStruct (different order of elements) and got 24 in the sizeof() - And I will be happy if answers will note and explain this case too.
On your platform, the following holds: The size of int and float are both 4. The size & alignment requirement of double is 8.
We know this from the sizeof output you've shown. sizeof (T) gives the number of bytes between the addresses of two consecutive elements of type T in an array. So we know that the alignment requirements are as I've said above. (Note)
Now, the compiler reported 16 for IntFloatAndDoubleStruct. Does it work out?
Assume we have such an object at an address aligned to 16.
int a is therefore at address X aligned to 16, so it's aligned to 4 just fine. It will occupy bytes [X, X+4)
This means float c could start at X+4, which is aligned to 4, which is fine for float. It will occupy bytes [X+4, X+8)
Finally, double d could start at X+8, which is aligned to 8, which is fine for double. It will occupy bytes [X+8, X+16)
This leaves X+16 free for the next struct object, again aligned to 16.
So there's no reason to start any of the members later, so the whole struct fits into 16 bytes just fine.
(Note) This is not strictly true: for each of these, we know that both size and alignment are <= N, that N is a multiple of the alignment requirement, and that there is no N1 < N for which this would also hold. However, this is a very fine detail, and for clarity the answer simply assumes the actual size and alignment requirements for the primitive types are identical, which is the most likely case on the OP's platform anyway.
Your struct must be 8*N bytes long, since it has a member with 8 bytes (double). That means the struct sits in the memory at an address (A) divisible by 8 (A%8 == 0), and its end address will be (A + 8N) which will also be divisible by 8.
From there, you store 2 4-bytes variables (int + float) meaning you now occupy the memory area [A,A+8). Now you store an 8-byte variable (double). There is no need for padding since (A+8) % 8 == 0 [since A%8 == 0]. So, with no padding you get the 4+4+8 == 16.
If you change the order to int -> double -> float you'll occupy 24 bytes since the double variable original address will not be divisible by 8 and it will have to pad 4 bytes to get to a valid address (and also the struct will have padding at the end).
|--------||--------||--------||--------||--------||--------||--------||--------|
|  each  ||  cell  ||  here  ||represen||-ts 4   || bytes  ||        ||        |
|--------||--------||--------||--------||--------||--------||--------||--------|
A         A+4       A+8       A+12      A+16      A+20      A+24                 [addresses]
|--------||--------||--------||--------||--------||--------||--------||--------|
|  int   || float  || double || double ||        ||        ||        ||        | [content - basic case]
|--------||--------||--------||--------||--------||--------||--------||--------|
In the changed-order case below, the first padding (at A+4) ensures the double sits at an address that is divisible by 8.
The last padding (at A+20) ensures the struct size is divisible by the largest member's size (8).
|--------||--------||--------||--------||--------||--------||--------||--------|
|  int   || padding|| double || double || float  || padding||        ||        | [content - change order case]
|--------||--------||--------||--------||--------||--------||--------||--------|
The compiler inserts padding to guarantee that each element is at an offset that is a multiple of its alignment requirement (for these basic types, its size).
In this case int will be at offset=0 (relative to address of a structure instance), float at offset=4, and double at offset=8, because sizes of int and float add up to 8.
There's no padding at the end - size of the structure is already 16, which is a multiple of size of double.
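To compare the two member orders discussed in these answers, one can measure the padding as the struct size minus the sum of the member sizes. The sketch below assumes 4-byte int and float and an 8-byte double, as in the question's output; the struct and function names are made up for the example.

```c
#include <stddef.h>

struct IntFloatAndDouble { int a; float c; double d; };
struct IntDoubleAndFloat { int a; double d; float c; };

/* Sum of the member sizes; the same for both structs, only the order differs. */
static size_t member_bytes(void)
{
    return sizeof(int) + sizeof(float) + sizeof(double);
}

/* Padding bytes the compiler added to the int, float, double order. */
size_t padding_int_float_double(void)
{
    return sizeof(struct IntFloatAndDouble) - member_bytes();
}

/* Padding bytes the compiler added to the int, double, float order. */
size_t padding_int_double_float(void)
{
    return sizeof(struct IntDoubleAndFloat) - member_bytes();
}
```

On a platform where double is 8-byte aligned, the first function is expected to return 0 (size 16) and the second 8 (size 24): 4 bytes before the double and 4 bytes at the end. Where double is only 4-byte aligned inside structs, both orders would need no padding.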
Well, after reading this Size of structure with a char, a double, an int and a t I still don't get the size of my struct which is :
struct s {
char c1[3];
long long k;
char c2;
char *pt;
char c3;
}
And sizeof(struct s) gives me 40.
But according to the post I mentioned, I thought that the memory should look like this:
0 1 2 3 4 5 6 7 8 9 a b c d e f
+-------------+- -+---------------------------+- - - - - - - -+
| c1 | |k | |
+-------------+- -+---------------------------+- - - - - - - -+
10 11 12 13 14 15 16 17
+---+- -+- -+- - - - - -+----+
|c2 | |pt | | c3 |
+---+- -+- -+- - - - - -+----+
And I should get 18 instead of 40...
Can someone explain to me what I am doing wrong ? Thank you very much !
Assuming an 8-byte pointer size and alignment requirement on long long and pointers, then:
3 bytes for c1
5 bytes padding
8 bytes for k
1 byte for c2
7 bytes padding
8 bytes for pt
1 byte for c3
7 bytes padding
That adds up to 40 bytes.
The trailing padding is allocated so that arrays of the structure keep all the elements of the structure properly aligned.
Note that the sizes, alignment requirements and therefore padding depend on the machine hardware, the compiler, and the platform's ABI (Application Binary Interface). The rules I used are common rules: an N-byte type (for N in {1, 2, 4, 8, 16 }) needs to be allocated on an N-byte boundary. Arrays (both within the structure and arrays of the structure) also need to be properly aligned. You can sometimes dink with the padding with #pragma directives; be cautious. It is usually better to lay out the structure with the most stringently aligned objects at the start and the less stringently aligned ones at the end.
If you used:
struct s2 {
long long k;
char *pt;
char c1[3];
char c2;
char c3;
};
the size required would be just 24 bytes, with just 3 bytes of trailing padding. Order does matter!
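The effect of the reordering can be measured directly. This sketch declares both orders and reports how many bytes of each struct are padding; the function names are invented here, and the actual numbers depend on the platform's pointer size and alignment rules.

```c
#include <stddef.h>

/* Original order from the question. */
struct s  { char c1[3]; long long k; char c2; char *pt; char c3; };

/* Most strictly aligned members first, as suggested above. */
struct s2 { long long k; char *pt; char c1[3]; char c2; char c3; };

/* Sum of the member sizes; identical for both structs. */
size_t payload_bytes(void)
{
    return 3 + sizeof(long long) + 1 + sizeof(char *) + 1;
}

/* Padding bytes in each layout. */
size_t padding_in_s(void)  { return sizeof(struct s)  - payload_bytes(); }
size_t padding_in_s2(void) { return sizeof(struct s2) - payload_bytes(); }
```

With 8-byte pointers and 8-byte-aligned long long, the payload is 21 bytes, so padding_in_s() should be 19 (size 40) and padding_in_s2() should be 3 (size 24).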
The size of the structure depends upon what compiler is used and what compiler options are enabled. The C language standard makes no promises about how memory is utilized when the compiler creates structures, and different architectures (for example 32-bit WinTel vs 64-bit WinTel) cause different layout decisions even when the same compiler is used.
Essentially, the size of a structure is equal to the sum of the size of the bytes needed by the field elements (which can generally be calculated) plus the sum of the padding bytes injected by the compiler (which is generally not known).
It is because of alignment, gcc has
#pragma pack(push,n)
// declare your struct here
#pragma pack(pop)
to change it. Read here, and also __attribute__((__packed__)).
If you declare the struct
struct packed
{
char c1[3];
long long k;
char c2;
char *pt;
char c3;
} __attribute__((__packed__));
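then compiling with gcc, sizeof(struct packed) is the plain sum of the member sizes:
c1: 3
k : 8
c2: 1
pt: 4 // it depends on the platform; 8 on typical 64-bit targets
c3: 1
That is 17 bytes with a 4-byte pointer, or 21 bytes with an 8-byte pointer.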
Apparently Visual C++ compiler supports #pragma pack(push,n) too.
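As a quick check of the packed layout, the sketch below compares the packed size with the sum of the member sizes. It relies on the GCC/Clang __attribute__((__packed__)) extension, so it is not portable C; the names s_packed, packed_size and sum_of_members are made up for the example.

```c
#include <stddef.h>

/* Same members as struct s, with padding suppressed (GCC/Clang extension). */
struct s_packed {
    char c1[3];
    long long k;
    char c2;
    char *pt;
    char c3;
} __attribute__((__packed__));

/* Size the compiler actually uses for the packed struct. */
size_t packed_size(void)
{
    return sizeof(struct s_packed);
}

/* Plain sum of the member sizes: 13 bytes plus the size of a pointer. */
size_t sum_of_members(void)
{
    return 3 + sizeof(long long) + 1 + sizeof(char *) + 1;
}
```

The two functions should agree, and offsetof(struct s_packed, k) should be 3, since k now starts right after c1 without the 5 padding bytes.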
what is a size of structure?
#include <stdio.h>
struct {
char a;
char b;
char c;
}st;
int main()
{
printf("%zu", sizeof(st));
return 0;
}
it shows 3 in gdb compiler.
#include <stdio.h>
#include <string.h>
typedef unsigned int uint32;
typedef unsigned char uint8;
int main()
{
double a = 1320.134;
uint32 b;
uint8 c[20];
b = (unsigned int)a;
c[3] = b; //c[3] = (unsigned char)b;
printf("value of %c", c[3]);
return 1;
}
I am trying to do some type conversion in my program. Inside the main function: 1: I am converting a double and storing it in a uint32. 2: I want to store the uint32 value in a character array at index 3, but I am not able to get the expected output if I do it as above. Can someone please help me with this?
output: value of c <. //some junk value
how to read the 1320 in c[3] ?
how to read the 1320 in c[3]
There is mathematically no way to read anything larger than 255 from a single unsigned char. The value that you see, 40 (0x28, which represents an opening parenthesis character) is the last eight bits of the 1320 - a result of truncation of 0x528.
use
printf("value of %d", c[3]);
instead of
printf("value of %c", c[3]);
In addition, an 8-bit unsigned integer can hold at most 255; the higher bits are cut off.
Here is the overview of what is happening:
First you set: uint32 b = (unsigned int)(1320.134) which just makes it b = 1320.
1320 is a number that needs more than 8 bits, so it won't fit into the 8-bit space (c[3]). Instead, the conversion keeps only the low 8 bits and ignores the leftover bits, so you get 1320%256 (the % means remainder after division), which happens to be 40. This number is then printed as the corresponding ASCII character.
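The two conversions described here can be reproduced in isolation. This is a minimal sketch using the standard fixed-width types instead of the question's typedefs; the function names are invented for the example.

```c
#include <stdint.h>

/* Converting a double to an unsigned integer discards the fractional part
   (the value must be within the range of the target type). */
uint32_t to_u32(double x)
{
    return (uint32_t)x;
}

/* Converting to an 8-bit unsigned type keeps the value modulo 256,
   that is, only the low 8 bits. */
uint8_t low_byte(uint32_t v)
{
    return (uint8_t)v;
}
```

to_u32(1320.134) gives 1320, and low_byte(1320) gives 40 (0x28), which is the character '(' when printed with %c in ASCII.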
REVISION:
So from what I understand, you want the value to overflow into the following elements, so you need to copy all four bytes of b into the array starting at c[3]. Casting &c[3] to a uint32 pointer and writing through it is not safe (the address may be misaligned for a uint32, and the access breaks C's aliasing rules), so here is how I would go about it with memcpy (from <string.h>, which your program already includes):
uint32 b = 1320;
uint8 c[20];
//copy the 4 bytes of b into c[3], c[4], c[5] and c[6]:
memcpy(&c[3], &b, sizeof b);
This should give the proper overflow, but I'm very curious: what could you possibly want to do this for?
Here is how it gets stored, by the way (on a little-endian machine such as x86):
| | | |40|5 |0 |0 | ...
The reason for this is that 1320 is 0x00000528 as a 32-bit value. A little-endian machine stores the least significant byte first, so the bytes end up in memory like this:
| | | |00101000|00000101|00000000|00000000|
Here 00101000 is 40 (0x28) and 00000101 is 5 (0x05), and 5 * 256 + 40 = 1320.
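If the byte order in the array should not depend on the machine, the bytes can be written one at a time with shifts. The sketch below stores the value least significant byte first and reads it back the same way; store_u32_le and load_u32_le are hypothetical helper names, not part of any library.

```c
#include <stdint.h>

/* Store v into dst[0..3], least significant byte first (little-endian order). */
void store_u32_le(uint8_t *dst, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Rebuild the 32-bit value from src[0..3], written by store_u32_le. */
uint32_t load_u32_le(const uint8_t *src)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)src[i] << (8 * i);
    return v;
}
```

After store_u32_le(&c[3], 1320), c[3] holds 40 and c[4] holds 5 on any machine, and load_u32_le(&c[3]) returns 1320 again.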
The following code;
struct s1 {
void *a;
char b[2];
int c;
};
struct s2 {
void *a;
char b[2];
int c;
}__attribute__((packed));
if s1 has a size of 12 bytes and s2 has a size of 10 bytes, is this due to data being read in 4 byte chunks and }__attribute__((packed)); reduces the size of void*a; to only 2 bytes?
A little confused as to what }__attribute__((packed)); does.
Many thanks
It is due to alignment, a process in which the compiler adds hidden "junk" between the fields to make sure they have optimal (for performance) starting addresses.
Using packed forces the compiler to not do that, which often means that accessing the structure becomes slower (or simply impossible, causing e.g. a bus error) if the hardware has problems doing e.g. 32-bit accesses on addresses that are not multiples of 4.
On Intel processors, fetching 32-bit data from an aligned address is generally faster than from an unaligned one; on many other processors, unaligned fetches might be illegal altogether, or need to be simulated using two instructions. Thus, on these 32-bit architectures, the first structure always has c aligned to a byte address divisible by 4. This, however, requires that 2 bytes be wasted in storage.
struct s1 {
void *a;
char b[2];
int c;
};
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | NA | NA | c0 | c1 | c2 | c3 |
// addresses increasing ====>
On the other hand, sometimes you absolutely need to map some unaligned datastructures (like file formats, or network packets), as is, into C structures; there you can use the __attribute__((packed)) to specify that you want everything without padding bytes:
struct s2 {
void *a;
char b[2];
int c;
} __attribute__((packed));
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | c0 | c1 | c2 | c3 |
// addresses increasing ====>
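When a packed layout is used to map raw bytes such as a file record or a network packet, the misaligned field can be read with memcpy rather than through a pointer to the member. This is only a sketch: it uses the GCC/Clang packed attribute from the example above, and read_c is a made-up helper name.

```c
#include <stddef.h>
#include <string.h>

struct s2 {
    void *a;
    char b[2];
    int c;
} __attribute__((packed));

/* Copy the int out of a raw byte image of struct s2.
   memcpy works for any address, so no aligned load is required. */
int read_c(const unsigned char *image)
{
    int value;
    memcpy(&value, image + offsetof(struct s2, c), sizeof value);
    return value;
}
```

Here offsetof(struct s2, c) is sizeof(void *) + 2, so c sits at offset 6 with 4-byte pointers (as in the diagram) or at offset 10 with 8-byte pointers; neither is a multiple of 4.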
This is due to data structure alignment, a combination of two processes: data alignment and data padding. The first structure will be aligned to the word as you said; the second structure, however, is packed, which forces the compiler not to pad it.
The second structure is 10 bytes because the 2 padding bytes after the character array are removed, not because the void pointer shrinks (it remains 4 bytes, as pointers are on this 32-bit platform). Packing can hinder performance, since saving 2 bytes of space is usually not worth the efficiency lost by the hardware, and misaligned accesses through pointers to packed members can lead to undefined behaviour.