How does gcc calculate the required space for a structure? - c

struct {
integer a;
struct c b;
...
}
In general how does gcc calculate the required space? Is there anyone here who has ever peeked into the internals?

I have not "peeked at the internals", but it's pretty clear, and any sane compiler will do it exactly the same way. The process goes like:
Begin with size 0.
For each element, round size up to the next multiple of the alignment for that element, then add the size of that element.
Finally, round size up to the least common multiple of the alignments of all members.
Here's an example (assume int is 4 bytes and has 4 byte alignment):
struct foo {
char a;
int b;
char c;
};
Size is initially 0.
Round to alignment of char (1); size is still 0.
Add size of char (1); size is now 1.
Round to alignment of int (4); size is now 4.
Add size of int (4); size is now 8.
Round to alignment of char (1); size is still 8.
Add size of char (1); size is now 9.
Round to lcm(1,4) (4); size is now 12.
Edit: To address why the last step is necessary, suppose instead the size were just 9, not 12. Now declare struct foo myfoo[2]; and consider &myfoo[1].b, which is 13 bytes past the beginning of myfoo and 9 bytes past &myfoo[0].b. This means it's impossible for both myfoo[0].b and myfoo[1].b to be aligned to their required alignment (4).
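The procedure above can be sketched in a few lines of C. This is only an illustration under the stated assumptions, not what gcc literally does internally: the names struct member, round_up and layout_size are made up for this example, and since alignments are assumed to be powers of two, the least common multiple in the last step is simply the largest alignment.

```c
#include <assert.h>
#include <stddef.h>

/* One struct member, described only by its size and alignment in bytes. */
struct member {
    size_t size;
    size_t align;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Walk the members in declaration order and return the padded total size. */
static size_t layout_size(const struct member *members, size_t count)
{
    size_t size = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < count; i++) {
        size = round_up(size, members[i].align);  /* padding before the member */
        size += members[i].size;                  /* the member itself */
        if (members[i].align > max_align)
            max_align = members[i].align;
    }
    return round_up(size, max_align);             /* tail padding for arrays */
}
```

For struct foo, passing the member list {1, 1}, {4, 4}, {1, 1} returns 12, matching the walk-through.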

There's no truly standardized way of aligning a struct, but the rule of thumb goes like this: the entire struct is aligned to a 4- or 8-byte boundary (depending on the platform). Within the struct, each member is aligned to its own size. So the following packs with no padding:
char // 1
char
char
char
short int // 2
short int
int // 4
This will have a total size of 12. However, this next one will cause padding:
char // 1, + 1 byte padding
short // 2
int // 4
char // 1, + 1 byte padding
short // 2
char // 1
char // 1, + 2 bytes padding
Now the structure takes up 16 bytes.
This is just a typical example; the details will depend on your platform. Sometimes you can tell a compiler to never add any padding. This makes memory access more expensive (and can introduce concurrency problems) but saves space.
To lay out aggregates as efficiently as possible, order the members by size, starting with the biggest.
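To check the effect of reordering on your own compiler, you can compare the two layouts directly. A minimal sketch; the struct and member names are invented for illustration, and the actual sizes depend on the platform (with 4-byte int and 2-byte short the typical result is 16 versus 12).

```c
#include <stddef.h>

/* Members in the interleaved order from the second listing. */
struct interleaved {
    char a;
    short b;
    int c;
    char d;
    short e;
    char f;
    char g;
};

/* The same members, ordered by size with the biggest first. */
struct reordered {
    int c;
    short b;
    short e;
    char a;
    char d;
    char f;
    char g;
};

/* Bytes saved by reordering on the current compiler (may be 0). */
static size_t bytes_saved(void)
{
    return sizeof(struct interleaved) - sizeof(struct reordered);
}
```

Printing bytes_saved() with %zu shows how much padding the reordering removed on your platform.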

The size of a structure is implementation-defined, and it is hard to say what the size of your structure will be without more information (your definition is incomplete). For instance, consider this struct:
struct MyStruct {
int abc;
int def;
char temp;
};
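This yields a size of 9 on my compiler: 4 bytes for each int and 1 byte for the char. That implies no tail padding; a compiler that aligns int to 4 bytes would report 12 instead.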

I modified your code so that it compiles, and ran it with Eclipse and the Microsoft C compiler:
struct c {
int a;
struct c *b;
};
struct c d;
/* sizeof yields size_t, so print it with %zu; print addresses with %p */
printf("\nsizeof c=%zu, sizeof a=%zu, sizeof b=%zu",
sizeof(d), sizeof(d.a), sizeof(d.b));
printf("\naddrof c =%p", (void *)&d);
printf("\naddrof c.a=%p", (void *)&d.a);
printf("\naddrof c.b=%p", (void *)&d.b);
A 32-bit build of this fragment produced the following output (the exact address format printed by %p varies by platform):
sizeof c=8, sizeof a=4, sizeof b=4
addrof c =0012ff38
addrof c.a=0012ff38
addrof c.b=0012ff3c
Do something like this so you can see (WITHOUT GUESSING) exactly how your compiler formats a structure.

Related

Accessing structure members using base address

Can you please help explain why following program correctly prints the values of all the structure members?
struct st
{
int i;
char c1;
int j;
char c2;
};
int main()
{
struct st a = {5, 'i', 11, 'H'};
struct st * pa = &a;
int first;
char second;
int third;
char fourth;
first = *((int*)pa);
second = *((char*)pa + 4); /* offset = 4 bytes = sizeof(int) */
third = *((int*)pa + 2); /* why (pa + 2) here? */
fourth = *((char*)pa + 12); /* why (pa + 12) here? */
printf ("first = %d, second = %c, third = %d, fourth = %c\n", first, second, third, fourth);
return 0;
}
Output: first = 5, second = i, third = 11, fourth = H
How can I make above program generalized?
That's because of the padding bytes added to the structure. Three padding bytes are added after char c1; because it is followed by an int (a member with a larger alignment requirement), so padding is inserted to place j at a multiple of its alignment.
How can I make above program generalized?
The only way to make it work reliably is by not guessing at the offset. Use the standard offsetof macro, and always do the pointer arithmetic with a character pointer:
first = *(int*)((char*)pa + offsetof(struct st, i));
You don't have to name the field at the point where you do the access, but you should definitely use the macro to compute the offset if you intend to pass it into your function.
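Here is a small sketch of what passing the offset into a function could look like. The helper names read_int_at and read_char_at are made up for this example; memcpy is used for the int so the access stays valid even if the offset is not suitably aligned.

```c
#include <stddef.h>
#include <string.h>

struct st {
    int i;
    char c1;
    int j;
    char c2;
};

/* Copy an int out of any object, given the byte offset of the int field. */
static int read_int_at(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}

/* Read a single char at the given byte offset. */
static char read_char_at(const void *base, size_t offset)
{
    return *((const char *)base + offset);
}
```

With struct st a = {5, 'i', 11, 'H'}, the call read_int_at(&a, offsetof(struct st, j)) returns 11 regardless of how much padding the compiler inserted.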
It is because of structure padding.
After padding your structure will look like below.
struct st
{
int i;
char c1;
char padding1[3]; // for alignment of j.
int j;
char c2;
char padding2[3]; // for alignment of structure.
};
Hence
first = *((int*)pa);
second = *((char*)pa + 4); /* offset = 4 bytes = sizeof(int) */
third = *((int*)pa + 2); /* offset = 8 bytes(pointer arithmetic) to point to int j*/
fourth = *((char*)pa + 12); /* offset = 12 bytes to point to char c2*/
For more info on structure padding read
Data_structure_alignment
As in the other answers: padding.
But some compilers allow you to pack your structures removing (in most cases) the padding.
gcc:
struct __attribute__((packed)) st
{
....
}
Code that accesses packed structs may be slower and larger.
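To see what packing changes, you can compare a padded and a packed version of the struct from the question. A sketch assuming GCC or Clang, since __attribute__((packed)) is a compiler extension; the names st_padded, st_packed and padding_removed are invented for this example.

```c
#include <stddef.h>

/* Default layout: the compiler may insert padding. */
struct st_padded {
    int i;
    char c1;
    int j;
    char c2;
};

/* GCC/Clang extension: no padding between or after members. */
struct __attribute__((packed)) st_packed {
    int i;
    char c1;
    int j;
    char c2;
};

/* Number of padding bytes that packing removed on this compiler. */
static size_t padding_removed(void)
{
    return sizeof(struct st_padded) - sizeof(struct st_packed);
}
```

In the packed version j sits directly after c1, at offset sizeof(int) + 1, so it is usually misaligned; that is the reason access can be slower.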
In this particular struct, every member ends up occupying a 4-byte slot: each char is followed by 3 bytes of padding, because the int after it (and the struct as a whole) must be 4-byte aligned. This is not a general rule; consecutive char members are stored next to each other without padding.
The padding exists because many processors read multi-byte values most efficiently (and some only) from addresses that are a multiple of the value's size. Memory itself is byte-addressable, but the CPU fetches data in units that depend on its bus architecture.
Also note that the offset depends on the pointer type you are using: a char* advances by 1 byte per step, while an int* advances by sizeof(int) bytes (4 here).
This also means that the code is not portable since, for example, int may not have the same size on different architectures.
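The pointer scaling can be checked with a tiny helper. This is just a sketch; byte_distance and int_step_bytes are hypothetical names used only here.

```c
#include <stddef.h>

/* Distance in bytes between two addresses inside the same object. */
static ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* Advancing an int pointer by n elements moves n * sizeof(int) bytes. */
static ptrdiff_t int_step_bytes(const int *p, ptrdiff_t n)
{
    return byte_distance(p, p + n);
}
```

So (int*)pa + 2 in the question lands 2 * sizeof(int) bytes past pa, which is 8 when int is 4 bytes, while (char*)pa + 12 lands exactly 12 bytes past pa.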

Struct memory allocation, memory allocation should be in multiple of 4

struct x
{
char b;
short s;
char bb;
};
int main()
{
printf("%d",sizeof(struct x));
}
Output is : 6
I ran this code on a 32-bit compiler and expected the output to be 8 bytes.
My reasoning: the char needs 1 byte, and the following short must start at a multiple of 2, so 1 byte of padding is inserted before its 2 bytes; 4 bytes are allocated at this point. The remaining char takes 1 byte, but since memory is allocated in multiples of 4, the total should be 8 bytes.
The alignment requirement of a struct is that of the member with the maximum alignment. The max alignment here is for short, so probably 2. Hence, two for b, two for s, and two for bb gives 6.
The C struct memory layout is implementation-specific, so you can't assume any particular padding.
Also, in the typical alignment of C structs a struct like this:
struct MyData
{
short Data1;
short Data2;
short Data3;
};
will also have sizeof = 6 because if the type "short" is stored in two bytes of memory then each member of the data structure depicted above would be 2-byte aligned. Data1 would be at offset 0, Data2 at offset 2, and Data3 at offset 4. The size of this structure would be 6 bytes.
See https://en.wikipedia.org/wiki/Data_structure_alignment
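If your compiler supports C11, you can ask it for the alignment directly instead of reasoning about it. A small sketch using _Alignof and offsetof on the struct from the question; the two helper function names are made up for illustration.

```c
#include <stddef.h>

struct x {
    char b;
    short s;
    char bb;
};

/* Nonzero if sizeof(struct x) is a whole multiple of its alignment,
   which is what keeps every element of an array of struct x aligned. */
static int size_is_multiple_of_alignment(void)
{
    return sizeof(struct x) % _Alignof(struct x) == 0;
}

/* Padding bytes the compiler placed between b and s. */
static size_t padding_before_s(void)
{
    return offsetof(struct x, s) - (offsetof(struct x, b) + sizeof(char));
}
```

On a platform where short is 2-byte aligned, _Alignof(struct x) is 2 and padding_before_s() is 1, which is consistent with the size of 6.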

Different Result than calculated,using the SIZEOF operator? [duplicate]

I was writing code to check the size of int, char, and a struct, but it gives a different result from the one I calculated manually.
#include<stdio.h>
struct person
{
int roll;
char name[10];
};
void main()
{
struct person p1;
printf("\n The size of the integer on machine is \t :: %d \n ",sizeof(int));
printf("\n The size of the char on machine is \t :: %d \n ",sizeof(char));
printf("\n The size of structre is \t :: %d \n",sizeof(struct person));
printf("\n The size of structre is \t :: %d \n",sizeof(p1));
}
I think the structure should have size 10 * 1 + 4 = 14, but the output is
The size of the integer on machine is :: 4
The size of the char on machine is :: 1
The size of structre is :: 16
See what wikipedia says!
To calculate the size of any object type, the compiler must take into account any address alignment that may be needed to meet efficiency or architectural constraints. Many computer architectures do not support multiple-byte access starting at any byte address that is not a multiple of the word size, and even when the architecture allows it, usually the processor can fetch a word-aligned object faster than it can fetch an object that straddles multiple words in memory. Therefore, compilers usually align data structures to at least a word alignment boundary, and also align individual members to their respective alignment boundaries. In the following example, the structure student is likely to be aligned on a word boundary, which is also where the member grade begins, and the member age is likely to start at the next word address. The compiler accomplishes the latter by inserting unused "padding" bytes between members as needed to satisfy the alignment requirements. There may also be padding at the end of a structure to ensure proper alignment in case the structure is ever used as an element of an array.
Thus, the aggregate size of a structure in C can be greater than the sum of the sizes of its individual members. For example, on many systems the following code will print 8:
struct student{
char grade; /* char is 1 byte long */
int age; /* int is 4 bytes long */
};
printf("%zu", sizeof (struct student));
You should try altering the size of the char array in your structure for a better understanding.
For example:
struct person
{
int roll;
char name[4];
};
Gives answer as 8
struct person
{
int roll;
char name[7];
};
Gives answer as 12
First, you need to change %d to %zu because sizeof yields a value of type size_t.
sizeof(p1) gives 16 bytes instead of 14 because 2 padding bytes are added to the structure.
struct person
{
int roll; // 4 bytes
char name[10]; // 10 bytes. 2 bytes are needed for structure alignment
};
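You can let the compiler tell you how many padding bytes follow the last member. A minimal sketch; trailing_padding is a hypothetical helper name, and the result depends on your platform.

```c
#include <stddef.h>

struct person {
    int roll;
    char name[10];
};

/* Bytes of padding after the last member (name). */
static size_t trailing_padding(void)
{
    return sizeof(struct person)
         - (offsetof(struct person, name) + sizeof(((struct person *)0)->name));
}
```

With a 4-byte int aligned to 4 bytes this returns 2, which accounts for the difference between 14 and 16. Print it with printf("%zu\n", trailing_padding()).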

How to allocate 16byte memory aligned data

I am trying to implement SSE vectorization on a piece of code for which I need my 1D array to be 16 byte memory aligned. However, I have tried several ways to allocate 16byte memory aligned data but it ends up being 4byte memory aligned.
I have to work with the Intel icc compiler.
This is a sample code I am testing with:
#include <stdio.h>
#include <stdlib.h>
void error(char *str)
{
printf("Error:%s\n",str);
exit(-1);
}
int main()
{
int i;
//float *A=NULL;
float *A = (float*) memalign(16,20*sizeof(float));
//align
// if (posix_memalign((void **)&A, 16, 20*sizeof(void*)) != 0)
// error("Cannot align");
for(i = 0; i < 20; i++)
printf("&A[%d] = %p\n",i,&A[i]);
free(A);
return 0;
}
This is the output I get:
&A[0] = 0x11fe010
&A[1] = 0x11fe014
&A[2] = 0x11fe018
&A[3] = 0x11fe01c
&A[4] = 0x11fe020
&A[5] = 0x11fe024
&A[6] = 0x11fe028
&A[7] = 0x11fe02c
&A[8] = 0x11fe030
&A[9] = 0x11fe034
&A[10] = 0x11fe038
&A[11] = 0x11fe03c
&A[12] = 0x11fe040
&A[13] = 0x11fe044
&A[14] = 0x11fe048
&A[15] = 0x11fe04c
&A[16] = 0x11fe050
&A[17] = 0x11fe054
&A[18] = 0x11fe058
&A[19] = 0x11fe05c
Every element appears to be only 4-byte aligned; I have tried both memalign and posix_memalign. Since I am working on Linux, I cannot use _mm_malloc, nor can I use _aligned_malloc.
I get a memory corruption error when I try to use _aligned_attribute (which I think is suitable for gcc only).
Can anyone help me allocate 16-byte aligned data with icc on Linux?
The memory you allocate is 16-byte aligned. See:
&A[0] = 0x11fe010
But in an array of float, each element is 4 bytes, so the second is 4-byte aligned.
You can use an array of structures, each containing a single float, with the aligned attribute:
struct x {
float y;
} __attribute__((aligned(16)));
struct x *A = memalign(...);
The address returned by memalign is 0x11fe010, which is a multiple of 0x10, so the function is doing the right thing and your array is properly aligned on a 16-byte boundary. What you do afterwards is print the address of each successive float element. Since a float is exactly 4 bytes in your case, each address equals the previous one plus 4; for instance, 0x11fe010 + 0x4 = 0x11fe014, which of course is not a multiple of 0x10. If you were to align every float on a 16-byte boundary, you would waste 16 - 4 = 12 bytes per element. Double-check the requirements of the intrinsics that you are using.
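Rather than reading the printed addresses by eye, you can test the alignment in code. A sketch assuming a POSIX system; alloc_floats_16 and is_aligned_16 are names invented for this example, and the feature-test macro is there only so posix_memalign is declared under strict -std=c11 builds.

```c
#define _POSIX_C_SOURCE 200112L  /* makes posix_memalign visible */
#include <stdint.h>
#include <stdlib.h>

/* Allocate count floats whose base address is a multiple of 16.
   Returns NULL on failure; release the block with free(). */
static float *alloc_floats_16(size_t count)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, count * sizeof(float)) != 0)
        return NULL;
    return p;
}

/* Nonzero if the address is a multiple of 16. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

For an array A returned by alloc_floats_16(20), is_aligned_16 holds for &A[0], &A[4], &A[8] and so on (every fourth float, assuming a 4-byte float), which is the pattern an aligned load of four packed floats needs.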
AFAIK, both memalign and posix_memalign are doing their job.
&A[0] = 0x11fe010
This is aligned to 16 byte.
&A[1] = 0x11fe014
When you write &A[1], you are telling the compiler to advance a float pointer by one element. This unavoidably leads to:
&A[0] + sizeof( float ) = 0x11fe010 + 4 = 0x11fe014
If you intend to have every element inside your vector aligned to 16 bytes, you should consider declaring an array of structures that are 16 byte wide.
struct float_16byte
{
float data;
float padding[ 3 ];
};
Then you must allocate memory for ELEMENT_COUNT (20, in your example) variables:
struct float_16byte *A = ( struct float_16byte * )memalign( 16, ELEMENT_COUNT * sizeof( struct float_16byte ) );
I found this code on Wikipedia:
Example: get a 12bit aligned 4KBytes buffer with malloc()
// unaligned pointer to large area
void *up=malloc((1<<13)-1);
// well aligned pointer to 4KBytes
void *ap=aligntonext(up,12);
where aligntonext() is meant as:
move p to the right until next well aligned address if
not correct already. A possible implementation is
// PSEUDOCODE assumes uint32_t p,bits; for readability
// --- not typesafe, not side-effect safe
#define alignto(p,bits) (p>>bits<<bits)
#define aligntonext(p,bits) alignto((p+(1<<bits)-1),bits)
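The pseudocode above assumes a 32-bit integer holding the address. A type-safe variant can be sketched with uintptr_t; the function name align_to_next is chosen here only to mirror the macro and is not taken from any library.

```c
#include <stdint.h>

/* Round address p up to the next multiple of 2^bits.
   Returns p unchanged if it is already aligned. */
static void *align_to_next(void *p, unsigned bits)
{
    uintptr_t mask = ((uintptr_t)1 << bits) - 1;
    return (void *)(((uintptr_t)p + mask) & ~mask);
}
```

Keep the original pointer returned by malloc around, because that one (not the aligned pointer) must be passed to free().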
I personally believe your code is correct and is suitable for Intel SSE code. When you load data into an XMM register, I believe the processor loads 4 contiguous floats from main memory, and only the first one needs to be aligned to 16 bytes.
In short, I believe what you have done is exactly what you want.
You could also use this in Visual Studio:
__declspec(align(16)) struct x {
long long a;
long long b;
char c;
};
instead of this
struct x {
float y;
} __attribute__((aligned(16)));

Read binary data (from file) into a struct

I'm reading binary data from a file, specifically from a zip file. (To know more about the zip format structure see http://en.wikipedia.org/wiki/ZIP_%28file_format%29)
I've created a struct that stores the data:
typedef struct {
/*Start Size Description */
int signature; /* 0 4 Local file header signature = 0x04034b50 */
short int version; /*  4 2 Version needed to extract (minimum) */
short int bit_flag; /*  6 2 General purpose bit flag */
short int compression_method; /*  8 2 Compression method */
short int time; /* 10 2 File last modification time */
short int date; /* 12 2 File last modification date */
int crc; /* 14 4 CRC-32 */
int compressed_size; /* 18 4 Compressed size */
int uncompressed_size; /* 22 4 Uncompressed size */
short int name_length; /* 26 2 File name length (n) */
short int extra_field_length; /* 28 2 Extra field length (m) */
char *name; /* 30 n File name */
char *extra_field; /*30+n m Extra field */
} ZIP_local_file_header;
The size returned by sizeof(ZIP_local_file_header) is 40, but if the sum of each field is calculated with sizeof operator the total size is 38.
If we have the next struct:
typedef struct {
short int x;
int y;
} FOO;
sizeof(FOO) returns 8 because y must start at a multiple of 4: x occupies 2 bytes, followed by 2 bytes of padding. If the next member were another short int, it would fill those 2 bytes; but since it is an int, 4 more bytes are allocated and the 2 padding bytes are wasted.
To read data from file, we can use the function fread:
ZIP_local_file_header p;
fread(&p,sizeof(ZIP_local_file_header),1,file);
But since there are empty bytes in the middle of the struct, the data isn't read correctly.
What can I do to sequentially and efficiently store data with ZIP_local_file_header wasting no bytes?
In order to meet the alignment requirements of the underlying platform, structs may have "padding" bytes between members so that each member starts at a properly aligned address.
There are several ways around this: one is to read each element of the header separately using the appropriately-sized member:
fread(&p.signature, sizeof p.signature, 1, file);
fread(&p.version, sizeof p.version, 1, file);
...
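Another is to use bit fields in your struct definition. Bit fields can pack more tightly than ordinary members, but their layout is implementation-defined, so check sizeof and the resulting layout on your compiler before relying on it. The downside is that bit fields must be unsigned int or int or, as of C99, _Bool; you may have to cast the raw data to the target type to interpret it correctly: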
typedef struct {
unsigned int signature : 32;
unsigned int version : 16;
unsigned int bit_flag : 16;
unsigned int compression_method : 16;
unsigned int time : 16;
unsigned int date : 16;
unsigned int crc : 32;
unsigned int compressed_size : 32;
unsigned int uncompressed_size : 32;
unsigned int name_length : 16;
unsigned int extra_field_length : 16;
} ZIP_local_file_header;
You may also have to do some byte-swapping in each member: ZIP stores multi-byte values in little-endian order, so they need to be swapped on a big-endian system.
Note that name and extra field aren't part of the struct definition; when you read from the file, you're not going to be reading pointer values for the name and extra field, you're going to be reading the actual contents of the name and extra field. Since you don't know the sizes of those fields until you read the rest of the header, you should defer reading them until after you've read the structure above. Something like
ZIP_local_file_header p;
char *name = NULL;
char *extra = NULL;
...
fread(&p, sizeof p, 1, file);
if ((name = malloc(p.name_length + 1)) != NULL)
{
fread(name, p.name_length, 1, file);
name[p.name_length] = 0;
}
if ((extra = malloc(p.extra_field_length + 1)) != NULL)
{
fread(extra, p.extra_field_length, 1, file);
extra[p.extra_field_length] = 0;
}
C structs are just about grouping related pieces of data together, they do not specify a particular layout in memory. (Just as the width of an int isn't defined either.) Little-endian/Big-endian is also not defined, and depends on the processor.
Different compilers, or the same compiler on different architectures or operating systems, will all lay out structs differently.
As the file format you want to read is defined in terms of which bytes go where, a struct, although it looks very convenient and tempting, isn't the right solution. You need to treat the file as a char[] and pull out the bytes you need and shift them in order to make numbers composed of multiple bytes, etc.
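As a rough sketch of that approach: read the 30 fixed bytes of the header into an unsigned char buffer and assemble each field by shifting. The offsets come from the table in the question; the names read_le16, read_le32, struct zip_local_header and parse_local_header are invented for this example.

```c
#include <stdint.h>

/* Assemble little-endian integers from individual bytes. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* In-memory form of the header; its own padding no longer matters. */
struct zip_local_header {
    uint32_t signature, crc, compressed_size, uncompressed_size;
    uint16_t version, bit_flag, compression_method, time, date;
    uint16_t name_length, extra_field_length;
};

/* Decode the 30 fixed bytes of a local file header. */
static void parse_local_header(const unsigned char *buf,
                               struct zip_local_header *h)
{
    h->signature          = read_le32(buf + 0);
    h->version            = read_le16(buf + 4);
    h->bit_flag           = read_le16(buf + 6);
    h->compression_method = read_le16(buf + 8);
    h->time               = read_le16(buf + 10);
    h->date               = read_le16(buf + 12);
    h->crc                = read_le32(buf + 14);
    h->compressed_size    = read_le32(buf + 18);
    h->uncompressed_size  = read_le32(buf + 22);
    h->name_length        = read_le16(buf + 26);
    h->extra_field_length = read_le16(buf + 28);
}
```

You would fread exactly 30 bytes into the buffer, call parse_local_header, and then read name_length and extra_field_length bytes for the two variable-length fields. Because each value is built from individual bytes, the result is the same on little-endian and big-endian hosts.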
The solution is compiler-specific, but for instance in GCC, you can force it to pack the structure more tightly by appending __attribute__((packed)) to the definition. See http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/Type-Attributes.html.
It's been a while since I worked with zip-compressed files, but I do remember the practice of adding my own padding to hit the 4-byte alignment rules of PowerPC arch.
At best you simply need to define each element of your struct to the size of the piece of data you want to read in. Don't just use 'int' as that may be platform/compiler defined to various sizes.
Do something like this in a header:
#include <stdint.h> /* exact-width integer types (C99) */
typedef uint32_t unsigned32;
typedef uint16_t unsigned16;
typedef uint8_t unsigned8;
typedef uint8_t byte;
Then instead of just int, use an unsigned32 where you have a known 4-byte value, and an unsigned16 for any known 2-byte value.
This will help you see where you can add padding bytes to hit 4-byte alignment, or where you can group 2, 2-byte elements to make up a 4-byte alignment.
Ideally you can use a minimum of padding bytes (which can be used to add additional data later as you expand the program) or none at all if you can align everything to 4-byte boundaries with variable-length data at the end.
Also, the name and extra_field will not contain any meaningful data, most likely. At least not between runs of the program, since these are pointers.
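If you go the route of arranging fields so that no padding is needed, you can have the compiler confirm the layout at build time. A sketch with a made-up struct record (not the ZIP header), assuming a C11 compiler for static_assert:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* exact-width integer types */

/* Hypothetical record: the two 2-byte fields are grouped so that
   together they fill one 4-byte slot. */
struct record {
    uint32_t id;
    uint16_t kind;
    uint16_t flags;
    uint32_t length;
};

/* The build fails if the compiler inserted padding anywhere. */
static_assert(offsetof(struct record, length) == 8,
              "kind and flags should fill one 4-byte slot");
static_assert(sizeof(struct record) == 12,
              "unexpected padding in struct record");
```

If either assertion fails on some platform, you find out at compile time instead of reading garbage from the file.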

Resources